Don Orth
Kdb+ comes with a comprehensive programming language: q (.q scripts). [We will also implement SQL.] q is similar to SQL but is simpler and more expressive. q implements all relational capabilities as well as timeseries analysis. This manual is concerned with the q versions of the standard SQL operations on tables.
Additional features are defined in Q Programming.
q is a programming and runtime environment that starts from and runs in a shell, which is either a Unix/Linux shell or a Windows Command Prompt. After kdb+ is started, the shell becomes a q shell, or q console, in which expressions are entered and evaluated. The activity of entering and evaluating expressions in a q shell is called a q session; the entered expressions and their results are maintained in a q session log. Data created and assigned a name remains available during the current session. The data created during a session are collectively referred to as the workspace.
The simplest q startup command is simply . That is, type
alone on a shell command line and press [Enter]. When q starts up you will see a copyright notice and license information. The OS shell has now become a q shell with a blinking cursor indented 2 spaces. Type any expression and then press [Enter]. The expression you entered will be evaluated and the result (if any) will be displayed below typed, starting at the left side of the shell window. After the result display is complete you will see the blinking cursor again, indented 2 spaces, ready for the next input. For example:
The annotations on the right are q language comments. A q comment starts with a slash ( ). There must be whitespace to the left of the slash. This entry-result format is used throughout this manual to display examples. You can repeat these examples in your shell, or you can make up some of your own.
Assignment in q is denoted by colon. For example,
The value of a data object can be displayed simply by typing its name, as follows.
Note that there is no result display following the specification of w, even though specification has a result (the value given to the name). This is simply the way the q console is designed.
Errors are reported in the console by displaying the error type, generally type or length, the failed primitive function, and its argument(s). If you see the default prompt, you can simply continue with your work. If you see a leading ")" on the prompt, type "\" and press [Enter]. You should then see the default prompt. Debugging is discussed in the Q Programming manual.
Exit from the q console with
The atomic datatypes are the same as those in SQL3. Here is a list.
The boolean atoms are denoted 0b and 1b. A boolean occupies 1 byte. Int (4 bytes) atoms are written in the usual way, and float atoms (8 bytes) can be written in the usual decimal and exponential formats. Short (2 bytes) and long atoms (8 bytes) are written like ints but also have a trailing h or j. Real atoms (4 bytes) are written like floats or ints but with a trailing e.
Symbols are character strings preceded by back-quote, as in `abc. (Not all syms can be formed simply by putting back-quote in front a sequence of characters.) Syms are also called symbols and varchars.
Date atoms must be specified in the form . For example, 2003.03.23 for March 23, 2003 is correct; 2003.3.23 is not. Time values have millisecond resolution and are specified in the form . For example,
q also has month (yyyy.mm), minute (hh:mm) and second (hh:mm:ss) types.
The following table defines the fields of the temporal datatypes. Field values can be extracted from temporal data using field names and dot notation (Extracting Temporal Field Values). Given:
Atom functions are those that apply to atomic arguments and produce atomic results. The following is a list of the primitive q atomic functions.
Binary operations in mathematics are called dyadic functions in q, and unary operations are called monadic functions. For example, is a dyadic function and is a monadic function.
The dyadic arithmetic functions are +, - , * and % for Divide. Note that Divide is denoted by %, not /. Numeric datatypes can be mixed in arithmetic expressions and required conversions from one to another are automatic. The result of any arithmetic expression that contains % is of type float. The arguments of % are converted to float before the operation is performed. For example,
The result type is int if there are no occurrences of % and all the datatypes are boolean, byte, short or int. Everything is converted to int before the operations are performed. For example,
If long integers are included among the arguments and there are still no occurrences of % then the results are always of type long.
The Sum, Product and Difference of two bools are ints.
The dyadic relational functions are =, < and >. These functions produce boolean values, where 1b means that the relation holds, or is true. For example,
There is one monadic relational function, named . This function gives the relationship of all numeric values to zero: the result is 1b if the argument is identical to 0 and 0b if it is not. For example,
Less or Equal in q is , Greater or Equal is and Not Equal is .
The relational functions in q use a multiplicative tolerance when applied to floats. This makes arithmetic work better.
For example,
The comparison functions are | for Max and & for Min.
The arguments can be of any numeric types, or both can be chars, or both syms. When the numeric types are different the result is one of the types, always the one to which the other argument can be safely converted. For example,
The Max of 2 chars is the one with the highest byte order. For example,
The logical functions (And, Or and Logical Negation) do not have separate symbols; they are, respectively, the restrictions of Min, Max and Not to boolean arguments. For example,
The Floor and Absolute Value functions are useful companions to the arithmetic functions. Floor applies to floats and reals and produces their integer parts as ints. For example,
Floor is often used to truncate and round floats to a specified number of decimal places. For example, the first expression truncates the float 72.277 to 2 decimal places and the second expression rounds it to 2 decimal places.
Absolute Value leaves non-negative numbers unchanged and negates negative numbers. For example,
Match is not an atom function, but is related to Equal and is so useful when experimenting with the language that we'll define it here.
It often happens that you want to compare two results to determine whether or not they are identical. This is often done in this manual simply by looking at them, but that doesn't always work, e.g. when the console display doesn't contain all the details. Sometimes Equal can be used, but not always. The primitive function called Match is the function to use. For example,
Comparison tolerance is used when matching floats.
Match depends on the datatype of the arguments, not just the values. For example,
q is a rich orthogonal language and therefore has no precedence among functions. Expressions are written and read from left to right. Expressions are evaluated left OF right -- unlike SQL and conventional mathematical notation. For example,
is
is
Anyone who has gone to school in the last 150 years may find this surprisingly simple. Watch out for logical or's. For example,
is
Ints representing day counts can be added to and subtracted from dates. As a result, dates can be added to and subtracted from dates to give day counts. For example,
Ints representing milliseconds can be added to and subtracted from times. For example,
Floats can be added to and subtracted from datetime values. The integer part of a float represents a day count and the fractional part represents a fraction of a day. For example, 2 hours and a half can be added to a datetime as follows.
Dates and datetimes can be compared to one another, while times can only be compared to times.
The IEEE arithmetic NaN (not-a-number) for floats is a float denoted by . Plus-infinity and Minus-infinity for floats are denoted by and . For example:
The sum, difference and product of two ints is always an int, the modulo 232 value of the mathematical result. For example,
Analogously, the sum, difference and product of two longs or a long and an int is always a long, the modulo 264 value of the mathematical result.
Kdb+ has two distinct primitive datatypes to which q applies: tables and keyed tables. With regard to traditional RDBMS's treat both datatypes as tables except for the key check on inserts. Tables can be created functionally with q primitive functions or with the syntactic form . The syntactic form is used here.
Tables are collections of data whose items, or "columns", have names. Following the SQL convention, tables are created by naming the items and initializing the items as empty lists. For example, the following expression creates a table named t.
Let's look at the parts of this expression. Colon (:) denotes specification. The above expression defines a table named t with items named n, x and y. The parentheses pair denotes an empty list, which means that each of the table items is defined to be an empty list. The column n, which is defined within square brackets, is the primary key of the table. The columns x and y are the data columns. Note that the definitions of x and y are separated by a semicolon. If there had been more than one primary key then those definitions would also have been separated by semicolons. If there is nothing between the square brackets then there are no primary keys.
At this point the datatypes of the items are unspecified. They will be fixed the first time data is inserted in the table, which is done with the function named . For example,
The symbol holds the name of the table. In general, data objects in the q workspace can be referred to in two ways, simply by their names or by symbols whose contents are those names. In the case of insert, using the name instead of the symbol will produce a new table with the new data, while using the symbol causes the named object to be modified with the new data. The above insert statement modifies the table t, which now has one row. The datatypes of the items are now fixed. Here's another row of data.
Note that the inserted data (275) will be automatically converted to the datatype of the column (float). An error will occur if the conversion cannot be done.
It is possible to specify the datatypes of the items when the table is initialized. This is necessary for foreign keys; see Foreign Keys.
The insert function is a dyadic function, and as such is usually evaluated by an expression of the form
The alternative , is explained in Q Programming.
If you type the name of the table t you will see its value, but what you see in the console requires an explanation that is beyond this manual. Consequently we will use a more intuitive display format for tables in this manual. For example, the table t will be displayed as follows.
The display shows an input line with the name of the table followed by a display of the table, as if this is done in a console. The name and the table display are separated by a blank line to make the name stand out. The table display has a header line with the column names, which is separated from the data by a horizontal rule. A vertical line separates the primary key columns from the data columns. There will be no vertical line if there are no primary key columns.
You can see a similar table display in the q web viewer. To do this you must first start kdb+ as a web server. For example, the following startup command creates a q shell listening on the web port 5001.
If you are working through this manual at a console, first exit and restart as a web server. Next, define the table t and insert a record, as above. Finally, connect to the server from a browser by entering the following in the browser address area.
You will then see a browser display of table t like the one above. You should also see a list of tables to the left of the display of t, although at this point t is the only table name in the list. When there is more than one table, simply click on any table name to view its contents.
The tables used in this manual for q examples are defined in the q distribution script sp.q. There are 3 tables, s (suppliers), p (parts) and sp (shipments). They are based on the supplier-parts sample database in A Guide to THE SQL STANDARD, Fourth Edition by C.J. Date with Hugh Darwen. You can load these tables into a q session (and create a web server as well) with the startup command
You can display the tables in a web browser by restarting the q session with this command and reconnecting from a browser with the above URL. Here are the tables.
Note that the supplier table and the parts table have primary keys. The shipments table has two foreign keys, although the display doesn't show it. These tables were initialized by the following expressions and filled in with the insert function; the insert expressions are not shown here.
The last expression contains something new. Columns and of are foreign keys; see Foreign Keys below.
Another table that is useful in examples is the following security trade table. This table is not defined in the sp.q script. You can cut-and-paste the following expressions into a q console to create and populate the table.
You can also cut and paste these lines into a text file to be saved as the script trade.q. The file can then be loaded into a q console with
The trade table is a simulation of financial data for securities trading. It has the same format as the daily trade tables in the kdb+taq historical database product and the taq ticker plant in the kdb+tick realtime tickerplant product. The items of the column named are sym atoms holding simulated ticker symbols, such as MSFT for Microsoft and GE for General Electric. For example, the data in the fourth row of the trade table is the following list.
The data indicates that a trade was made at 2 seconds after 9:30AM for 2200 shares of the stock with ticker symbol aaa at the price of $96.25 per share.
A foreign key defines a mapping from the rows of the table in which it is defined to the rows of the table with the corresponding primary key. Foreign keys in SQL provide referential integrity. Namely, an attempt to insert a foreign key value that is not in the primary key will fail. This is also true in q. For example, the following insert into the shipments table will fail because s0 is not a defined supplier.
An entry for s0 must first be inserted in the suppliers table. In SQL one says that a foreign key represents a reference from its host table to the primary key table; hence the term referential integrity. In q the reference is actually a mapping from one table to the other; it may be more accurate to use the term domain integrity here.
The foreign key expression in the definition of sp, which denotes the mapping of sp to s, is called an enumeration. The symbol `s in the expression refers to the suppiers table. You can rename column in the suppliers table or column in the table and the expression remains the same. However, if you change the name of the suppliers table to, say suppliers, then the foreign key expression must be changed to .
The mapping represented by a foreign key defines each column of the corresponding primary key table as a so-called virtual column in the foreign key table. For example, the color column of the parts table is a virtual column of shipments table , as the following display indicates.
The first 3 columns are identical to the shipments table. You can see that wherever two items of the parts column are identical, the corresponding items of the colors column are identical. For example, according to the suppliers table the part p1 is red, and red appears in the above color column wherever p1 is in the column. q has a simple mechanism for specifying virtual columns in queries; see Dot Notation.
Select expressions are q expressions that create new tables from existing ones. In their simplest form, select expressions extract subtables. However, it is also possible for them to create new columns. Select expressions have the following general form.
The subexpression following the keyword is called the phrase. Analogously, there is a phrase following the keyword ( ),a expression following the keyword ( ) and a phrase following the keyword ( ). Only the expression is required. The result of a select expression is a table.
The phrase is a list of conditional expressions that applies to and defines the rows that contribute to the result. If there is no phrase then all rows of are used. The phrase defines the data columns of the result. If there is no phrase then the result has the same column names as .
The phrase defines how the phrase columns are aggregated; see Aggregation. The columns named in the phrase are the primary key columns of the result. If there is no phrase then there are no aggregations and no primary key columns in the result. Note that unlike SQL, the columns named in the phrase must not be repeated in the phrase.
Foreign keys in the phrase become foreign keys in the result unless their values are modified.
1. Sub-tables of specific columns can be selected.
2. Specific rows and columns can be selected.
3. Specific rows can be eliminated.
4. The result is not necessarily a sub-table of the one named in the espression; result columns can be computed or renamed.
5. The following example is a reminder of the q execution rules (see Order of Evaluation). The parentheses in the following phrase are required.
You can evaluate a query in a browser by entering the expression in the address area after the host:port designation. For example, enter
and you will see the following result.
If you are familiar with SQL then you have seen dot notation in that context, where for example, denotes column of the shipments table . This notation allows references to multiple tables in a single query. It has a similar meaning in q, but there is also another use of the dot notation in q. Namely, if b is a foreign key in table t and c is any column in the corresponding primary key table w, then is the virtual column of c in t defined by the foreign key mapping from t to w. The example in Virtual Columns shows the color column of table as a virtual column of table . The table in that example can be produced using dot notation as follows.
This use of dot notation is analogous to SQL natural joins. For example, the above result is also produced by the SQL query
In the SQL query the p in p.color refers to the table p; in q it refers to the foreign key p in the table sp. The script sp.q has several examples of q queries using q dot notation and equivalent SQL natural joins.
Just as every column of table p defines a virtual column of sp, so does every column of table s. For example,
As before, equal values in the result column s correspond to equal values in the status column.
Temporal field values are extracted from temporal values by using dot notation, as in data.field. For example,
The underlying concept of temporal dot notation and foreign key dot notation are the same. Foreign key dot notation gives access to every column in the corresponding primary key table as a virtual column. Likewise, temporal dot notation gives access to every field of a temporal column as a virtual column. For example, the following query accesses the seconds virtual column of the trade table time column.
The boolean primitive functions discussed in this section broaden the range of "where" phrases. All three functions are dyadic and produce boolean results.
All three are list functions, not atom functions, and therefore we begin this section with more about lists.
We have already seen lists and list notation in the formation of row data to be inserted in a table with the insert function (Relational Tables and the Insert Function). For example, a list of an int atom, boolean atom and symbol atom is expressed as follows.
There are simpler forms. When the items are all atoms of the same datatype, the list is called a simple list. For example,
is a simple int list,
is a simple boolean list, and
is a simple symbol list. Note that the items in a simple int list must be separated by at least one space, but there can be no spaces between the items of a simple sym or boolean list.
Every atomic datatype has a simple list form. Simple lists of a temporal type require spaces between items. Simple char and byte lists, like simple boolean lists, allow no spaces between items. For example,
and
Simple short, long, real and float lists are like int lists, with items separated by at least one space. In general, the type of the last item determines the type of a simple, numeric list, as in the above boolean example.
A simple float list is a list of int and float numbers where only one item is in float format, and that item can be any position. For example, is a simple float list. The following display shows its entry in a q console, followed by the response.
Here is a summary of simple list types.
Atom functions apply to lists in an item-by-item manner. For example,
In the first example, two lists of the same length are summed item-by-item. In the second example, an atom is summed with each item of a list.
These rules govern the way atom functions apply to table columns within q expressions.
For example, in the phrase
each item of the size column is compared to 2000.
Table columns are lists; in this manual they are simple lists, but this is not a requirement, as you will see in Q Programming. For the remainder of this manual, we'll assume that lists are simple lists unless stated otherwise. For example, we will say that in an int list, meaning a simple int list.
The function named determines whether or not each item of the left argument list is among the items of the right argument list. For example,
That is, both 1 and 4 from the left argument are also in the right argument; the other items of the left argument are not. Note that the number of items in the boolean result always equals the number of items in the left argument. Moreover, an item in the result depends only on the corresponding item in the left argument (and the entire right argument). Consequently, this function is said to be an atom function of the left argument. All three functions in this section are atom functions of their left arguments and list functions of their right arguments.
The expression in the phrase of example 5 in Select Examples can be rewritten in terms of the function as .
The function is strict about the datatypes of its arguments; the datatypes of simple list arguments must be the same.
The right argument of this primitive function is always a two-item list. The result is a boolean list with the same number of items as the left argument. The result indicates whether or not each item of the left argument is within the bounds defined by the right argument. That is, if the right argument is and c is the ith item of the left argument, then the ith item of the result is 1b if both a is less or equal c and c is less or equal b. For example,
Items 3, 6 and 4 are within the interval 2 6, but items 1 and 10 are not. This function is not as strict about the datatypes of its arguments as the function. For example, if both arguments are numeric then they need not have the same datatype.
The function also applies to chars and syms because both are ordered (see Comparison Functions. For example,
and
The primitive function is the q pattern-matching primitive. Patterns in text are expressed by using certain reserved characters that have special meanings. The right argument of is a char list holding the pattern. The left argument can be a symbol atom, a char list, or any list of these two. The result is a boolean list with 1b for every item in the left argument that matches the pattern, and 0b otherwise. The following examples use a list of char lists because we have not yet discussed how to create complex syms. The left argument in the following example is a list of a few telephone book entries.
The reserved pattern-matching characters are "?", "*", the pair "[]", and "^" within square brackets. The reserved character "?" represents an arbitrary character in the pattern. The reserved character "*" represents an arbitrary sequence of characters in the pattern. For example, the pattern "Smith*" will locate all entries whose last name begins with Smith. The pattern "Sm?th*" will locate all four entries.
The reserved characters "[]" are used in pairs to list alternatives. For example, the pattern "Sm?th*" would select all entries in a more extensive list that contains the last name Smothers or Smathers. However, by using the alternative pattern "[iy]" in place of "?", we can restrict the selection to names that start with Smith or Smyth.
We can try finding everyone with the telephone exchange code 321 as follows.
Unfortunately, this pattern also picks up the item for Ken Smythe, who has "321-" as part of his address. Since the exchange code is part of a telphone number the "-" must be followed by a digit, which can be expressed by the pattern . There is a shorthand for long sequences of alternatives, which in this case is .
Other sequences for which this shorthand works are sequences of alphabetic characters (in alphabetic order). The pattern in the last example isn't foolproof. We would also have picked up Ken Smythe's item if his street number had been 321-1a instead of 321-a. Since the telephone number comes at the end of the text, we could repeat the above alternative four times and leave out the final "*", indicating that there are four digits are at the end of each item.
Unfortunately again, this pattern misses the last item, which has an error in the last position of the telephone number. However, in this case the simpler pattern will work. It is generally best to not over-specify the pattern constraint.
Finally, the reserved character "^" is used to select characters that are not among the specified alternatives. For example, there are errors in some items where the last position in the telephone number is not a digit. We can locate all those errors as follows.
Bin is a verb that can be used in place of In for left arguments that are simple lists of items in increasing order. Bin uses a binary search algorithm, which is generally more efficient than the linear search algorithm used by In. The result is also more useful when an item of the right argument is not among the items of the left argument.
The items of the left argument must be unique. The right argument can be either an atom or simple list of the same type as the left argument. Otherwise, a type error is reported. In the case
for an atom y, r is an int atom whose value is either a valid index of x or -1. Specifically, the result r is the int i if y is either equal to or strictly between and . Otherwise, the result is -1 if . For example,
Bin is a atom function of its right argument. That is, if the right argument is a simple list then so is the result r, and equals for every index j of y.
Note that Bin expects the left argument to be in sort order, but does not enforce it.
Consider the following two queries.
The essential difference is that & in the first phrase is replaced by comma in the second. In this context comma is also an expression separator, so that there is no need for parentheses around the sub-expression . These two queries produce the same result, but in ways that can be significantly different with regard to execution efficiency for large tables.
We know how the the first query works. Logically, the phrase is evaluated to give a boolean list and the phrase extracts the rows of the trade table corresponding to 1b's in that list. The second query works differently. Logically, the following query is evaluated first to give a temporary table.
The evaluation requires evaluating the phrase on the size column of the trade table. The original query is then completed by applying the following query to the temporary table.
This evaluation requires evaluating the phrase on the sym column of the temporary table.
The two queries can be put together in one nested query, which is the reason the phrase in the second of the above queries is called a nested phrase.
There are two significant differences in the way the above queries are evaluated. First of all, the expression is evaluated on the entire trade table in the first query but on the smaller (often much smaller) temporary table in the second query. Secondly, the & in the expression is not evaluated at all in the second query. The greatest performance improvement comes from the first of the two differences in evaluation. In practice, it can be two orders of magnitude.
Since & is commutative, the order of the expressions in the nested where phrase doesn't affect the end result. That is, the query with the nested phrase could just as well be the following one.
When this query is evaluated the temporary table will consist of all rows of the trade table for the securities aaa and xx and the phrase will be applied to that table.
If possible, you should choose an order for the expressions in a nested phrase that produces the smallest temporary tables at each step. This can't always be done, but perhaps in this case it can. In a realistic trade table the trading activity on indivdual securities is generally known, as is the average size of trades. For example, if the securities aaa and xx are lightly traded it would most likely be best to put the phrase on the left because the temporary table will be relatively small.
The arrangement of expressions in a nested phrase for optimal performance is not always so obvious. There is more on this topic in Q Programming.
There is an important example where the order of the expressions in nested phrases makes a truly significant difference. The kdb+tick product from Kx Systems, Inc. creates historical databases of trades and quotes from various security exchanges. Each database is a parallel database stored in separate segments, with one segment for each trading date. Queries are applied separately to the segments and the results are combined to give final results. A typical query applies to some date range, not the entire history. All such queries can use nested where phrases in which the first expression defines the date range. In that way, the remaining evaluations are applied only to the segments for those dates instead of the entire history.
Aggregation functions, also called summarization functions, apply to lists of atoms and produce atoms. For example,
(The sum function converts its argument to floats before summing.)
Select statements with aggregation functions in their select phrases usually also have by phrases. The simplest phrase is the name of a single column. For example,
The sym column of the result, which is the primary key, has one instance of each unique ticker symbol in the trade table. Each item of the volume column is the volume of all trades for the corresponding ticker symbol in the sym column. (We say that the volume calculation is grouped by ticker symbols). For example, the aggregation result says the the volume of all trades for ticker symbol aaa is 5800 shares. That value can also be computed as follows.
Unlike SQL, the columns (sym in this case) automatically appear in the result and therefore should not be included in the phrase. The result of an aggregation is always a key table with the "by columns" as the primary key columns.
There can be more than one column in the phrase; multiple columns are separated by commas. For example, the following query calculates the volume for each ticker symbol during each hour of trading.
Note that the result has a dual primary key, columns sym and hh. Even though the items in these columns are not unique, pairs of the items in the same rows of the two columns are unique.
The columns in a phrase are not necessarily the ones in the table. In the preceding example the by column is a virtual column. By columns can also be computed columns; see The xbar Function.
The basic aggregation functions are the ones that have been part of SQL from the beginning (or nearly so). Here is a list.
The max and min functions are useful in financial calculations, giving high and low prices over specified time intervals. The time interval in the above examples is the time span of the trading day, but it can also be specified by including the time column in the phrase. See the above trade volume example. Here is another example.
There are two companion functions in finance that give the open and close prices. q has aggregation functions for those calculations but SQL does not, which brings up a fundamental difference between kdb+ tables and standard SQL tables.
Tables in standard SQL are not ordered. That is, there is no concept of one table row coming before another. It is possible to define a cursor on a result set and then manipulate the cursor rows in order, but that can't be done in the SQL query language. kdb+ tables are ordered. In particular there is the concept of the first row and last row of kdb+ table. The aggregation functions named first and last give open and close prices in financial calculations. For example,
Every table has a virtual column named i which holds the indices the rows, starting at 0. For example,
Note that plays the role of in SQL. The count function is special because it gives the same result when applied to any column of a table. This function does not depend on the values of the items in a column, only on the number of items. Consequently, SQL has a special construct that is used in queries whenever a row count is needed. The comparable thing in q is , although it only a convention here, not a requirement.
The aggregation function has fewer applications than , but there is an interesting example in Joe Celko's SQL for Smarties (2nd edition, Morgan Kaufman Publishers). Given a table named portfolio with columns id (security id), date and rate (daily rate of return), calculate the rate of return for each portfolio over a date range. The author points out that the query to accomplish this is essentially, in q
However, the author then goes on to other solutions because there is no aggregation function in SQL.
The weighted average aggregation function produces the average of the items of its right argument weighted by the items of its left argument. For example:
The financial analytic called vwap is a weighted average (volume-weighted-average-price). For example:
Interval bars are prominent in aggregation queries. For example, to roll-up prices and sizes in 10 minute bars:
Or suppose you want to calculate the number of securities with prices in each $5 increment. The following calculation using the Floor primitive (Floor and Absolute Value) determines the bars for a given price list. The following example of a bar calculation with Floor uses the price column from the trade table.
The calculation based on floor can be used in an aggregation query.
Note that q permits expressions in "by phrases". Xbar applies to all numeric and temporal datatypes and produces a result of the same type.
It is possible to insert more than one row at a time. This is called a bulk insert. To do this, list d is replaced with a table having the same column names as t. For example,
The keywords select and update define q expressions which execute in a certain order. The general form is:
The content represented by an elipsis mark is called a phrase. Usually the word "phrase" is preceded by the keyword immediately to the left of the phrase, as in phrase and phrase. Only the keyword is required in these expressions. The expression defines the table to which the rest of a expression is applied and can be another q expression. If it is itself a select expression it must have surrounding parentheses if there is a phrase that belongs to the top-level expression. For example,
The expression is always evaluated first, followed by the phrase. Next comes the phrase, and finally the phrase.
A , or phrase is a list of expressions separated by commas, as in or . As the examples show, the sub-phrases do not require surrounding parentheses. The sub-phrases are evaluated left to right. For example, if the phrase is then is evaluated first and applied to the "from table", and then is evaluated and applied. Phrases and sub-phrases are evaluated by the rules described in Order of Evaluation.
The primitives in this section are the standard SQL set operation primitives. The following tables t1 and and t2 are used in examples.
The Distinct function can be used in aggregations in the same way as in SQL. The following example compares a aggregation with a aggregation of the sp table.
In SQL, the DISTINCT keyword is permitted in the phrase, while in q, the Distinct function can be applied to any table to remove duplicate rows. For example,
These primitives also apply to lists, not only tables. Some simple examples are the following.