K (programming language)

From Bauman National Library
This page was last modified on 2 June 2016, at 09:06.
К
Paradigm array processing, functiona
Designed by Arthur Whitney
Developer Kx Systems
First appeared 1993
Typing discipline dynamic, strict
OS Linux, OS X, Windows
License commercial
Website None
Influenced by
A+, APL, Scheme

K is a proprietary array processing language developed by Arthur Whitney and commercialized by Kx Systems. Since then, an open-source implementation known as Kona has also been developed. The language, originally developed in 1993, is a variant of APL and contains elements of Scheme. Advocates of the language emphasize its speed, facility in handling arrays, and expressive syntax. The language serves as the foundation for kdb, an in-memory, column-based database, and other related financial products.

History

Before developing K, Arthur Whitney had worked extensively with APL, first at I. P. Sharp Associates alongside Ken Iverson and Roger Hui, and later at Morgan Stanley developing financial applications.

In 1993, Whitney left Morgan Stanley and developed the first version of the K language. At the same time he formed Kx Systems to commercialize the product and signed an exclusive contract with Union Bank of Switzerland (UBS). For the next four years he developed various financial and trading applications using K for UBS.

The contract ended in 1997 when UBS merged with Swiss Bank. In 1998, Kx Systems released kdb, a database built on K. kdb was an in-memory, column-oriented database and included ksql, a query language with a SQL-like syntax. Since then, a number of financial products have been developed with K and kdb. kdb/tick and kdb/taq were developed in 2001. kdb+, a 64-bit version of kdb was released in 2003 and kdb+/tick and kdb+/taq were released in 2004. kdb+ included Q, a language that merged the functionality of the underlying K language and ksql.

Owerview

One of the large draws of K is the extreme programmer productivity it offers, its incredibly fast execution speeds, and its very small executable units. Example of these claims met is how KDB outperforms Oracle on TPC benchmarks in both query speed and data storage size while essentially being written by one person. Also, K code is very dense, and it is typical to see 100:1 code size reduction when migrating to from C to K. I have heard of almost a 1000:1 reduction when a project moved from Java and SQL to K and KSQL.

K is an exceptional language for dealing with mathematical analysis, financial prediction, or anything that handles bulk data.

K has bindings to other popular language such as C, Java, VisualBasic, and Excel. There has also been work done on bindings to Python and Mozilla's XUL. K's builtin interprocess communication and binary format for objects is very simple and documented so making other systems interact with K is often equally simple.

Even though K is an interpreted language, the source code is somewhat compiled internally, but not into abstract machine code like Java or Python. The interpreter can be invoked interactively or non-interactively, and you can also compile a source file down to a binary file for product distribution, if you wish.

One of the hardest things for many people to get over at first is the way K looks. But there is almost no syntactic sugar or special cases. K is even translatable into English.

Operators are called verbs, and data is called nouns. There are also operators that modify other operators that are called adverbs. Stringing some nouns, verbs, and adverbs together will produce clauses and sentences. This dialogue has been inherited from APL and is often abandoned for the more commonplace names of operators, functions, and variables.

Comments and Conventions

Forward-slash (/) is used for comments when it comes at the beginning of a line or it has space to the left of it. K tends to favor simplicity over sugar. One thing that may confused people is K has no precedence rules. Everything is parsed from right to left. For example, 3*2+1 in K will produce 9, instead of expected 7. But you can use parentheses.

Values

There are four simple types of values in K

  • integer
  • floating point
  • character
  • symbol

A special null type for the singular value of _n and two composite types dictionaries and lists. The composite types are containers for other types. Lists are classified as homogenous(carring around type of containing value) or heterogeneous.

Examples:

  (1; 2.3; 4; 5.6)       / heterogeneous list of integers and floats
  1 2 3 4 5              / homogeneous list of integers
  1.2 3.4 5.6            / homogeneous list of floats
  "quack"                / homogeneous list of characters
  (1 2; 3.4 5.6; "meow") / a list of lists
  (1;2 3;4 5 6)          / a vector of integers

  "abcdefghijlkmnopqrstuvwxyz"[14 8 13 10] / => "oink"

  ("qwerty";"poiuy";"asdf";"jhgfdsa")[;3] / slicing or projection => "ruff"

Interpreter uses special expressions - "symbols" - for variable lookup. They are created by the backtick followed by any legal variable name:

`myvariable
`"variable_with_..complex_//allowed_name"

Verbs

To allow for compact code operators are overloaded with two cases: a monadic (one argument) and diadic (two argument) use. In each case the operator has slightly different behavior depending on the type or domain of the arguments.

Examples:

  !4       / enumerate: a list of integers from 0 to x-1 => 0 1 2 3

  5!3      / mod: the residue of the left modulus the right => 2

  2!1 2 3  / rotate: spins right back-to-front left number of positions => 3 1 2

  ,2       / enlist: a one item list containing only 2 => ,2

  1,2      / join: forms one list of the left and right argument => 1 2

  |1 2 3   / reverse: reverses a list => 3 2 1

  0|1      / max: the maximum also boolean OR => 1

  &1 2 3 4 / where: returns the number of units specified => 0 1 1 2 2 2 3 3 3 3

  &0 1 1 0 0 1 / where: an important use to get the indices of the 1s => 1 2 5

  0&1      / min: the minimum also boolean AND => 0

  4<5      / less-than: predicate of "is the left smaller than the right" => 1
  <7 4 9   / grade-up: sorts indices in ascending order => 1 0 2

  =1 0 1 0 0 3 0 1 3 1  / group: groups all indices of same value
                        / (0 2 7 9
                        /  1 3 4 6
                        /  5 8)

  2=0 1 2 3 4           / equals: compares values => 0 0 1 0 0

  ?1 0 1 0 0 3 0 1 3 1  / unique: all unique elements in order seen => 1 0 3

  1 0 1 0 0 3 0 1 3 1?3 / find: the first indice of the right in left => 5

Variables and Bindings

K is a dynamically, strongly typed language and variables are not declared, but they come into existence when you assign a value to it. This can be done anywhere, even in the middle of an expression since there is no distinction between statements or expressions. If you try to read a value from a variable that has not yet been assigned to, you will raise an error. There are also no pointers. In true functional style, when you assign to a variable a deep copy of the value is made (K does this lazily, though). Assignment is done via the colon and it is read as "gets" or "is." As a special case, when an assignment is the last thing in an expression, null is returned (this helps prevent cluttering up the display log). You can force the return of a value from an assignment statement by using a case of the monadic colon.

Examples:

  a:"moo"       / a gets the string "moo"
  b:!10         / b gets enumerate 10 (integer list from 0 to 9)
  :c:b          / c gets the value of b, but changes to b do not effect c => 0 1 2 3 4 5 6 7 8 9

  :h:(g*2),g:1+2 / h get g times 2 join g, where g gets 1 plus 2 => 6 3
  g              / => 3

User-defined Functions

Braces {} are used to create functions; they are the equivalent of lambda in Lisp. Often they are used then the resulting function is assigned to a variable, but sometimes not. To assist in making the code compact, if a function requires three or fewer arguments, K will allow you to implicitly use x, y, and z as the arguments. If you need more than three arguments you must declare them all. All functions return the value of the last executed statement, even if that statement return null. To call a function you use brackets (just like a list index). If you do not supply an argument when calling a function, it will project.

Examples:

  pyth:{_sqrt(x*x)+y*y} / notice the two implicit arguments
  pyth[30;40]           / => 50.0

  pyth[3 15;4 20]       / cute huh. => 5 25.0

  dist:{[x1;y1;x2;y2] _sqrt((x2-x1)^2)+(y2-y1)^2}
  dist[1;1;4;5]         / => 5.0

  :d:dist[1;1]          / project or curry the first two arguments
                        / {[x1;y1;x2;y2] _sqrt((x2-x1)^2)+(y2-y1)^2}[1;1]

  d[7;9]                / => 10.0

  :e:dist[1;;4]         / project on first and third argument
                        / {[x1;y1;x2;y2] _sqrt((x2-x1)^2)+(y2-y1)^2}[1;;4]

  e[2;6]                / => 5.0

  inc:1+
  inc 8                 / => 9

System Functions and Variables

After running out of punctuation Arthur made system function. Every symbol beginning with an underscore is reserved for either a system variable or system function. System functions use infix, like their less readable cousins, but like user defined functions they cannot be overloaded with monadic and dyadic cases (in the next version of K this will be changed and users will be able to define infix functions and overload them with n-adic cases).

Examples:

  3_draw 5        / list of 3 random numbers from 0 to 4 => 2 2 4

  2_draw 0        / list of 2 random real numbers from 0 to 1 => 0.2232866 0.9504653

  4_draw-4        / deal: list of 4 random nonrepeating numbers from 0 to 3 => 2 0 1 3

  4 13_draw-52    / deal a deck of cards into four piles
                  / (29 27 10 0 23 3 28 5 24 16 40 8 22
                  /  51 20 36 47 18 31 26 11 44 37 38 9 13
                  /  39 42 34 50 21 6 19 46 48 45 14 43 2
                  /  33 49 4 25 41 30 35 7 32 17 1 12 15)

  1 3 4 5 7 9_bin 4 / binary search through list returning index => 2

  1 3 4 5 7 9_binl 2 4 6 / binary seach for a list of numbers => 1 2 4

  16_vs 543212    / vector from scalar: changes base to 16 => 8 4 9 14 12

  5 3 2_vs 21     / also does variable change of base => 3 1 1

  5 3 2_sv 3 1 1  / scalar from vector: the inverse => 21

  _host`kuro5hin.org        / returns ip address as integer => -815566008

  256_vs _host`kuro5hin.org / presentation form => 207 99 115 72

Adverbs

You rarely write loops in K (KDB is 100% loop-free), instead you use adverbs. An adverb modifies a function, returning another function, changing the ways it operates over its arguments and what it does with it's return values. Here are some of adverbs' usages.

  • Over (/) modifies a diadic function and will apply the function down a list, collection the result.
  • Converge (/) modified a monadic function and will continually apply the function to the previous result until either the initial value or the result of the preceding value is returned.
  • Scan (\) will apply the function down a list, collection all intermediate results (this is sometimes called trace). There is a trace analog to all usages of over.
  • Each (') will apply the function down lists of the same length (equal to the valence of the function).
  • Each-right (/:) will hold the left argument of the function and apply the function down the list of right arguments.

Examples:

  +/1 2 3 4          / plus-over (sum): is similar to 1+2+3+4 => 10

  +\1 2 3 4          / plus-scan: returns the intermediate values of +/ => 1 3 6 10

  |/5 3 7 4 2        / max-over: compares all items like 5|3|7|4|2 => 7

  ,/(1 2;(3 4;5);6)  / join-over: (1 2),(3 4;5),6 => (1;2;3 4;5;6)

  ,//(1 2;(3 4;5);6) / flatten: explained below => 1 2 3 4 5 6

  3 4_draw'-3 -4     / draw-each: (3_draw-3),(4_draw-4)
                     / (1 2 0
                     /  2 0 1 3)

  2_draw/:10 100     / draw-right-each: (2_draw 10),(2_draw 100)
                     / (7 7
                     /  45 91)

Join-over joins without performing flatting of nested lists. Neither do flatten. Also you can use flatten-each (,//') and specify as many ' symbols as number of levels from top wich you don't want to flat.

Conditionals

Although rarely used, there are a few conditionals; most often used is the colon. It is similar to cond in Lisp: it takes pairs of arguments and an optional final argument. The first argument of each pair is tested for truth (0 is false, all other integers are true, anything besides an integer is an error). If it is true then the result of evaluating the second of the pair is returned. If it is false then the next pair is tested. If all the pairs have been exhausted, then the final argument is evaluated and the result returned. If there is no final argument and all the conditions are false, then null is returned.

Examples:

  :[0;"true";"false"]          / => "false"

  s:{:[x>0;"+"; x<0;"-"; "0"]} / returns the sign of x or 0
  s 4                          / => "+"
  s -3                         / => "-3"

Naive Primality Test

Primary predicate:

  isprime:{&amp;/x!/:2_!x}  / min over x mod right-each 2 drop enumerate x  isprime 14 / => 0

  isprime 7  / => 1

  !14 / list of all integers from 0 to x (exclusive)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 

  2_!14 / remove the first two elements
2 3 4 5 6 7 8 9 10 11 12 13

  14!/:2_!14 / residue of x and each of the numers in the list
  / => 0 2 2 4 2 0 6 5 4 3 2 1

  &amp;/14!/:2_!14 / number is prime when the lowest residue is 1 (considered true)
  / => 0

The K-tree

It falls nicely into the K philosophy of keeping things simple and powerful. The K-tree can be used for modularization of programs, as a scoping mechanism, for GUI design, and as a rudimentary object system.

All variables exist somewhere on the K-tree. To reference a variable you separate the name of the variable and each branch name with periods. The root of the tree is an empty symbol. For example, you might have a fully qualified variable named .branch.subbranch.variable.

A branch is really just a dictionary. If I were to assign a dictionary that contained symbols sym and bol to the variable .tr.ee then you would be able to access .tr.ee.sym and .tr.ee.bol. It goes the other way, too. If you were to create the variables .dict.ion and .dict.ary then the variable .dict would be a valid dictionary. This makes the language very reflective since you can now modify variables locations, scopes, and manually manipulate extents.

Whenever you are in the K environment you are always running in a branch of the K-tree. When the K interpreter starts it places you in the .k branch. You can move around the tree with the directory (\d) command. For example \d .tw.ig would create the new branch tw then create a subbranch ig. You can inspect all the variables in a branch with the list variables (\v) command. Often a script will begin with a change directory command and then define all of its variables in that (and maybe a few more) branches. This effectively uses the K-tree as a module system.

  \d .test          / create a new directory off the root
  \d ement          / create a sub-branch
  \d                / show the current directory => .test.ement

  new:`small        / put some values in the directory
  old:`big
  \v                / list the contents o => new old

  \d ^              / back up one directory in the tree
  \d                / => .test

  \v                / => ement

  ement             / inspect the value of ement
                    / .((`new;`small;)
                    /  (`old;`big;))

  .test.ement.new   / fully-qualified => `small

  ement.old         / partially qualified => `big

  ement`old         / index like an array => `big

Other Unique Elements of K

K uses the tree to hold an attribute structure. These attributes are used for documentation strings, GUI representations, for other functionality in K, and for whatever you decide to use them for. Some of the programming environments for K make entensive use of attributes to store information about where functions are defined and other administrative information.

K also has a the concept of dependencies and triggers. They are used for efficient, demand-based calculation of data and executing callback code whenever a value is changes (this is how the timer system in K works). K will keep track of out of date dependencies for you and only do the minimal amount of work necessary to update values.

K has a unique, declarative GUI subsystem that is based around the K tree and triggers. It takes a minimal amount of work to put a GUI on your application. K takes the approach that you show variables and changes made to that variable on the screen are immediately reflected in the program without any work by you.

K's interprocess communication (IPC) and network communication systems is also based around callbacks and message handlers. There are simple K primitives that will ship entire K data structures around for you, or you may do it yourself. The goal of the IPC system is like the goal with rest of K, make it fast, simple, powerful, and highly useful.

In a time when programming languages are lacking in originality and are not bringing new ideas to the table, K succeeds where others fail. But K is not just a research language, not appropriate for real-world use. While the community may be small some of the users of the language are very big. With recent implementations by Island ECN and the US government, this looks to only be getting bigger, too. The next version of the language will fix many of the nagging holes and annoyances and take away the line noise factor that has pushed many away.

Here are some closing simple segments of K that are informative on how the K way of approaching a problem may be different:

  cut:{1_'(&amp;x=*x)_ x:" ",x}
  cut "Scoop ate my spaces"
  / ("Scoop"
  / "ate"
  / "my"
  / "spaces")

  fibonacci:{x(|+\)\1 1}
  fibonacci 5
  / (1 1
  /  2 1
  /  3 2
  /  5 3
  /  8 5
  /  13 8)

  first:*:
  euclid:{first(|{y!x}\)\x,y} / Euclid's Algorithm
  euclid[24;40]
  / (24 40
  /  16 24
  /  8 16
  /  0 8)

K financial products

K is the foundation for a family of financial products. Kdb is an in-memory, column-based database with much of the same functionality of a relational database management system. The database supports SQL, (SQL-92) and ksql, a query language with a syntax similar to SQL and designed for column based queries and array analysis.

kdb is available for Solaris, Linux, OSX and Windows (32-bit or 64-bit).

External links