Kymata documentation

The support documentation for the Kymata Atlas aims to cover the theoretical framework behind the atlas, as well as guidance for new users.

The Kymata Atlas is a database of human brain function. While the atlas aims to be as easy-to-use as possible, understanding its contents is often difficult for new users. With this in mind, we recommend that new users read the 'background' section before embarking on using the atlas itself.

Background

One of the biggest problems new users of Kymata face is that of definitions. It is very easy to talk about the brain 'processing' or 'computing' information – but what does this actually mean?

We start with a thought experiment. If any of us wanted to describe the human brain, we would probably start with descriptions like the following:

- "The human brain weighs 1.4kg."
- "The human brain stores 80% of incoming audio information into short-term memory."

While these might seem reasonable statements, the language used is actually quite vague. Is the weight an average of all human brains? Is this the mean or the median? Does the weight depend on age or gender? What do we mean by short-term memory, and what does ‘stores’ mean in this context?

If we want to describe the brain precisely, then all these descriptions, and others like them, will have to be written in a language that is completely unambiguous: the language of mathematics.

But which mathematical language to use? Every mathematical language has an 'expressive power', that is, the range of mathematical concepts the language can describe. When describing a complex object such as the brain, we will want to use the mathematical language with the greatest expressive power, which will give us the greatest likelihood of being able to describe it accurately.

Prior to 1930, the ability of mathematicians to determine the expressive power of any particular mathematical language was limited. While a large number of different mathematical languages existed, it was difficult to determine whether one was more powerful than another. In 1936 this changed; a number of mathematicians, working independently, constructed mathematical languages whose power could be determined precisely.

The first of these, λ-calculus, could describe a very large number of concepts. But its creator, Alonzo Church, demonstrated that there existed concepts that it couldn’t accurately describe. This means that it is not all-powerful.

But what happened next was somewhat of a surprise. Each of the other powerful mathematical languages developed around the same time turned out to be exactly as powerful as λ-calculus.

Indeed, all mathematical languages developed since have been shown to have expressive power either equivalent to, or weaker than, these 1936 languages. This includes all modern programming languages, including Python, C and R, which are of equal power.

As a result, any statement written in one of these languages can be translated perfectly into any of the others with no loss of meaning. In other words, if I write something in λ-calculus, there is a statement in Python that means exactly the same thing, and vice versa.

Any language that has this property is referred to as a Turing-equivalent language.

These developments have an important consequence for the natural sciences. If you want to accurately describe a feature of the world using a mathematical language – fluid dynamics, the trajectory of planets, or the behaviour of insects – the language you choose should be the most powerful available; that is to say, one that is Turing-equivalent. But it does not matter which one. Some may be easier to read, others may be easier to learn, but as they are all equivalent, any of them can be used.

The same logic extends to neuroscience. When we hypothesise statements about the brain – from observations about its physical appearance to how it processes stimuli – we need to write these statements in a Turing-equivalent language.

The Kymata Atlas follows this logic, and all descriptions in the atlas are written in a Turing-equivalent language.

What is the Kymata Atlas?

The Kymata Atlas is a list of statements about the brain, written in a Turing-equivalent language. Put another way, it is a list of precise hypotheses about what computations the brain is performing, together with comparative evidence for those hypotheses. Most of these statements refer to how information in the human brain is manipulated and stored. The Atlas is not exhaustive. It aims to map the processes of the human brain with an emphasis on transparency, accuracy and statistical rigour.

How do we test statements found in the Kymata Atlas?

Statements about the world are often tested using the scientific method, in the hope of assigning some form of plausibility to them, where 0 is impossible and 1 is certain (Jaynes, 2002):

- All men are mortal = 1
- It will rain tomorrow = 0.5
- There exists a married bachelor = 0

The Kymata Atlas ‘statement list’ conforms to the same format. Take these two statements as an example (translated here from a Turing-equivalent language into English, which makes them easier to read but reduces their precision):

Statement	Plausibility
Sound information undergoes transform X, with the output of this transform stored in the dendritic current of layer V neurons at location [21, -5, 16], with this transform taking 45ms duration.	0.96
Sound information undergoes transform Y, with the output of this transform stored in the dendritic current of layer V neurons at location [21, -5, 16], with this transform taking 45ms duration.	0.02

The current list of statements maintained in Kymata, and their plausibilities, can be found here (a star next to the statement, known as a "function" in Kymata, indicates it has a plausibility close to 1). An API, giving users access to the list in JSON format, can be found here.
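As a rough illustration of the statement list, the sketch below holds the two example statements as records and marks those whose plausibility is close to 1. The field names and the 0.95 cut-off are assumptions made for this example; the documentation does not state the threshold used for a star.

```python
from dataclasses import dataclass

# Hypothetical cut-off for "close to 1"; not a value published by Kymata.
STAR_THRESHOLD = 0.95


@dataclass(frozen=True)
class Statement:
    text: str            # English paraphrase of the statement
    plausibility: float  # 0 = impossible, 1 = certain

    def __post_init__(self):
        if not 0.0 <= self.plausibility <= 1.0:
            raise ValueError("plausibility must lie between 0 and 1")

    @property
    def starred(self) -> bool:
        """True when the plausibility reaches the assumed star threshold."""
        return self.plausibility >= STAR_THRESHOLD


statements = [
    Statement("Sound information undergoes transform X (abridged)", 0.96),
    Statement("Sound information undergoes transform Y (abridged)", 0.02),
]

for s in statements:
    marker = "*" if s.starred else " "
    print(f"{marker} {s.plausibility:.2f}  {s.text}")
```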

The statements in the list, and their plausibilities, are regularly updated. For details of how these plausibilities are calculated, see Thwaites et al. (2015).

For most users, interpreting a list of mathematical statements can be pretty hard going. This is why we often use 'processing pathways' to visualise them – the subject of the next section.

Interpreting the atlas

Understanding processing pathways

Statements about the brain – especially those concerning the way information is transformed – can be quite complicated. In the Kymata Atlas, we try to make the interpretation of such statements easier by visualising such statements as 'pathways'.

In order to picture this, it is helpful to consider an example. Let us imagine that we are describing the processing inside a computer. By design, computers act according to specific Turing-equivalent statements so as to make their behaviour predictable. This makes describing computers easier than describing the brain, which is why we will start with them as an example.

Let us imagine that we are told a computer has been designed with the following list of statements, each of which has the plausibility of 1.

Here, KEYBOARD_INPUT stands for the word typed at the keyboard (the input), and SCREEN_OUTPUT is the value printed out to the screen. Each instruction consists of a variable (a, b, c or d) and a function, such as get_input(), which is the transforming part of the instruction. The first statement, each_of_the_following_statements_takes_place_sequentially(), tells us that each of these statements will be done in order, from top to bottom.

As each of these statements has a plausibility of 1, the behaviour of the computer will be that it takes a word given through the keyboard, and prints 'yes' onto the screen if the first and last letters are the same, and 'no' if they are not.
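A minimal Python sketch of a program with the behaviour just described is given here as an illustration only. The names get_input(), a, b, c, d, KEYBOARD_INPUT and SCREEN_OUTPUT come from the text; first_letter(), last_letter() and are_equal() are hypothetical names chosen for this example. Python runs statements from top to bottom, so no explicit ordering statement is needed.

```python
def get_input(keyboard_input: str) -> str:
    """Return the word typed at the keyboard."""
    if not keyboard_input:
        raise ValueError("expected a non-empty word")
    return keyboard_input


def first_letter(word: str) -> str:  # hypothetical name
    return word[0]


def last_letter(word: str) -> str:  # hypothetical name
    return word[-1]


def are_equal(x: str, y: str) -> str:  # hypothetical name
    return "yes" if x == y else "no"


def run(keyboard_input: str) -> str:
    a = get_input(keyboard_input)
    b = first_letter(a)
    c = last_letter(a)
    d = are_equal(b, c)
    return d  # the value sent to SCREEN_OUTPUT


KEYBOARD_INPUT = "level"
SCREEN_OUTPUT = run(KEYBOARD_INPUT)
print(SCREEN_OUTPUT)  # prints "yes": 'level' starts and ends with 'l'
```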

These statements could be more complicated. For instance, each variable is stored in a specific location in the computer. There is no reason why this location information cannot also be in each statement:

This list of statements is a richer one, but it has become quite difficult to read. Let’s imagine someone had given us this list, and said that each of the statements has a plausibility of 1. We can make these statements easier to read by adding the location of each variable as an interactive label (hold your mouse over each of the variables to view their location):

This is much better. The new list still tells us the locations of the variables, as before when they were written out in full. Because we can translate between both formats with no loss of meaning, this new format is, in effect, a new Turing-equivalent language. But can we make it even easier to read? Consider the following:

Here the same set of statements is expressed as a flow diagram. The variables are still one after the other (left to right), but the functions are now represented by arrows. The first statement – that these statements are carried out one by one, in order – is not necessary to include in the diagram, as it is implied by the direction of the arrows.

We call this representation of the computer's set of instructions the information processing pathways map. This representation emphasises that the values in b and c are both a consequence of a, that their values affect d, and that the values of b and c cannot influence each other. This is exactly the same set of statements as before, just in a different visual format: once more, another Turing-equivalent language.
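The dependency structure of such a map can also be written down as a small directed graph. In the sketch below, the links between a, b, c and d follow the description above; the links from KEYBOARD_INPUT to a and from d to SCREEN_OUTPUT are inferred, and influences() is a hypothetical helper written for this example.

```python
# Each variable maps to the variables it is computed from (its inputs).
PATHWAYS = {
    "a": ["KEYBOARD_INPUT"],  # inferred link
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "SCREEN_OUTPUT": ["d"],   # inferred link
}


def influences(source, target, graph=PATHWAYS):
    """Return True if `source` can reach `target` by following the arrows."""
    stack = list(graph.get(target, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == source:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False


print(influences("a", "d"))  # True: a feeds d through b and c
print(influences("b", "c"))  # False: b and c are independent
print(influences("c", "b"))  # False
```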

We can take this idea further. Let’s imagine a situation where, instead of a statement at the top of the list that defines the ordering of the other statements, we were given the explicit timings at which each of these statements is carried out. We are told that the input enters the system at 0 ms, and that the subsequent ordering is expressed as latencies relative to when the information entered the system. Each instruction happens after another, so the latencies must get higher as we go down the list. For computers, this latency is measured in milliseconds. In our ‘list of instructions’ representation, this would look like this:

But we can simplify these statements to the following, by putting these latencies down the side:

In the processing pathways representation, the latencies can be marked in like this:

The latency is on the horizontal axis. As the cursor is moved left and right over the map, the timer displays the latency at which the variable is transformed. The latency is 0 ms when the input enters the system, and it takes 48 ms for the answer of the subsequent calculation to reach the SCREEN_OUTPUT. We have also added the physical locations of KEYBOARD_INPUT and SCREEN_OUTPUT to the map, in the same way we did for variables.

This pathways visualisation is still a Turing-equivalent representation of the computer, but has condensed function, time and location information into an easy-to-read format. In Kymata, statements are displayed in this pathways format in the information processing pathways graph (only the plausible pathways are shown).
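One way to picture what the pathways format holds is as a list of records, each carrying a function, a location and a latency. In the sketch below only the 0 ms input latency and the 48 ms output latency come from the text; the intermediate latencies, the location strings and every function name other than get_input() are placeholders invented for the example.

```python
from typing import NamedTuple


class Step(NamedTuple):
    output: str      # variable written by this step
    function: str    # transform that produces it
    location: str    # where the output is stored (placeholder values)
    latency_ms: int  # time since the input entered the system at 0 ms


steps = [
    Step("a", "get_input()", "location 1", 12),
    Step("b", "first_letter()", "location 2", 24),
    Step("c", "last_letter()", "location 3", 30),
    Step("d", "are_equal()", "location 4", 36),
    Step("SCREEN_OUTPUT", "display()", "screen", 48),
]


def latencies_increase(items):
    """Check that each step happens strictly later than the one before it."""
    latencies = [s.latency_ms for s in items]
    return all(a < b for a, b in zip(latencies, latencies[1:]))


for step in steps:
    print(f"{step.latency_ms:>3} ms  {step.output} = {step.function}  [{step.location}]")
print(latencies_increase(steps))
```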

Processing Pathways in Kymata

Let’s imagine you open up the pathways graph in Kymata and see the following pathways graph:

[Figure: example pathways graph. Horizontal axis: latency offset from the environment, starting at 0 ms.]

This tells us there exists a processing pathway between the auditory receptors in the left ear and Heschl’s gyrus, a region in the brain. The graph also tells us that the transform of the information between these two points, characterised as a function, is called calculate_hilbert_envelope(). By clicking on the function name, Kymata will give us details of this function, as well as the author of this function (in this case the mathematician David Hilbert) and where to find more detailed information about this function (in this case, one of Hilbert's books). Although the calculate_hilbert_envelope() equation looks a bit complicated, it means that the brain calculates a value similar to the 'loudness' of the sound entering the ear.
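For readers who want to see what a Hilbert envelope is computationally, the sketch below builds the textbook analytic signal with a plain discrete Fourier transform and takes its magnitude. It illustrates the general technique only and is not the implementation used in Kymata; hilbert_envelope() is a name chosen for this example, and the slow O(n²) transform is suitable only for short signals.

```python
import cmath
import math


def hilbert_envelope(signal):
    """Return the magnitude of the analytic signal of a real-valued sequence."""
    n = len(signal)
    # Forward discrete Fourier transform.
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and the Nyquist bin when n is even), double the positive
    # frequencies and zero the negative ones.
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue
        if k < (n + 1) // 2:
            spectrum[k] *= 2
        else:
            spectrum[k] = 0
    # Inverse transform gives the complex analytic signal.
    analytic = [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(z) for z in analytic]


# A tone of constant amplitude 0.5: its envelope should be flat at about 0.5.
tone = [0.5 * math.cos(2 * math.pi * 4 * t / 64) for t in range(64)]
envelope = hilbert_envelope(tone)
print(min(envelope), max(envelope))
```

The tone oscillates between -0.5 and 0.5, while its envelope stays near 0.5 throughout, which is the sense in which the envelope tracks something like the level of the sound rather than its waveform.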

The graph also tells us that it takes 27 ms for this transform to take place, with the output being ‘saved’ to Heschl’s gyrus at 27 ms latency, relative to the sound entering the ear. 'Saved' and ‘stored’ are, of course, terms we normally associate with computers, but they mean a similar thing here: that the output of this function is being written to a physical medium in the brain, namely the postsynaptic apical dendritic current in layer V pyramidal neurons. Kymata refers to this phenomenon as expression or entrainment, but these are just different terms for the same thing.

We often write the name of the function in a monospaced slab serif typeface and with brackets to remind us that it is a function. Although we have given these functions names (calculate_hilbert_envelope() is so named because we are calculating the Hilbert envelope), naming conventions are arbitrary. See Function naming conventions for more information.

Using Kymata

Quick-start Tutorial

Please read the 'General > Processing Pathways' section of the documentation before continuing.

...

Processing Pathways Viewer

Please read the 'General > Processing Pathways' section of the documentation before reading this section.

...

Surface Viewer

The surface viewer allows the user to visualise the location of a function's expression on the cortex, with sources arranged as tessellating hexagons. Latency is represented as time, which can be started and paused using the button in the top right-hand corner. The current latency is displayed at the bottom left of the screen.

The expression plot, accessible using the tab at the bottom of the viewer, allows the user to see the degree of expression on the cortex for a function, and this expression's latency. The plot shows all sources in the cortex (the left hemisphere on the top and the right hemisphere on the bottom). The further away from the centre the source, the stronger the evidence of expression for that function for that source. Strong expression, passing a high threshold, is coloured a darker grey, turning to colour when the current latency is over it. The current latency is marked by a triangle, and can be moved with the cursor or keyboard arrows.

By default, the expression for the instantaneous_loudness() function (KID: qrlfe) is displayed. The expression for other functions can be loaded from the function browser.

The viewing options, found in the top left-hand corner, allow the user to alter various display settings, including axis display, which hemispheres to display and which surface (pial, grey/white-matter boundary or midline) to project the expression to. The cortical surface can be expanded by using the slider in the bottom left corner of the screen.

Hovering the cursor over a source on the cortical surface will reveal that point's MNI coordinate, hemisphere and anatomical region. The anatomical regions (also known as 'cortical labels') displayed in Kymata are those of the Desikan-Killiany-Tourville-40 cortical labelling atlas [6, 7], which can be cited as:

A. Klein, J. Tourville (2012) "101 labeled brain images and a consistent human cortical labeling protocol" Frontiers in Brain Imaging Methods. 6:171. DOI: 10.3389/fnins.2012.00171

Hovering over a location also highlights the corresponding point in the expression plot.

Function Browser

The function browser can be accessed from the surface viewer by clicking 'function browser' in the top right-hand corner. The browser lists the entire set of functions tested in Kymata, in alphabetical order of the function names. Each entry has a selection of meta-data for that function available to view, including a brief description, the authors, and a basic characterisation of the function itself.

You can search or filter the results using the search bar. Filtering options include the ability to filter by input stream, or by whether the function shows any significant expression.

To view the expression for a function, select the arrow to the right of its entry.

Citing

Citing the Kymata Atlas (as an entity)

This format should be used when you want to cite the atlas as an entity, i.e.

... and I found the documentation of the Kymata Atlas (Kymata Atlas, 2016) to be adequate, but not inspirational.

A Kymata Atlas data descriptor is not yet available; please cite the website:

The Kymata Atlas (2016) The Kymata Atlas homepage, https://kymata.org (retrieved 20-08-16)

Citing Kymata pathways and expression data

Warning: This atlas is still in beta. We do not recommend citing data in Kymata unless you have spoken to support staff first.

This format should be used when you want to cite the pathways or expression data. Significant expression in Kymata refers to the entrainment of an output of a function at a particular location, 'L', and, by implication, to the existence of the function itself. This does not mean that the function exists at L, as the function characterises the transformation of information between two locations. So:

CIELAB A* expression has been reported at MNI co-ordinate (3, -11, 23) at a latency of 45ms (Kymata Atlas, 2016).

and

Existence of the CIELAB A* function is supported by its expression at MNI co-ordinate (3, -11, 23) at a latency of 45ms (Kymata Atlas, 2016).

are both correct, but

The CIELAB A* function has been reported at MNI co-ordinate (3, -11, 23) at a latency of 45ms (Kymata Atlas, 2016).

is not. The associated reference might look something like this:

Kymata Atlas (2016) "Expression for CIELAB A* [KID:UYBPJ, Dataset 3.01]" Kymata Atlas; Cambridge University. https://kid.kymata.org/UYBPJ/latest

As well as the function name and KID, the reference also contains the dataset (see datasets section). The name of the dataset currently loaded in the viewer is displayed below the viewer and above the 'how to cite' button.

Also contained in the reference is a URL to the function's expression in Kymata (see sharing section).
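The reference format shown above can be assembled mechanically from the function name, KID and dataset ID. The helper below is a small sketch of that; format_kymata_reference() and its parameter names are invented for this example and are not part of any Kymata tool.

```python
def format_kymata_reference(function_name, kid, dataset_id, year):
    """Build a reference string in the format used in the example above."""
    return (
        f'Kymata Atlas ({year}) "Expression for {function_name} '
        f'[KID:{kid}, Dataset {dataset_id}]" Kymata Atlas; Cambridge University. '
        f"https://kid.kymata.org/{kid}/latest"
    )


reference = format_kymata_reference("CIELAB A*", "UYBPJ", "3.01", 2016)
print(reference)
```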

For reasons of clarity or changed language use, the co-ordinators of Kymata will occasionally re-name functions, potentially leading to confusion (see the naming conventions section for more details). It is thus a good idea to reference a function's KID (unique identifier) at least once in your text, e.g.

CIELAB A* [Kymata ID: UYBPJ] expression has been reported at MNI co-ordinate (3, -11, 23) at a latency of 45ms (Kymata Atlas, 2016).

A function's KID will never change, so by citing the KID you ensure that your readers will always be clear about the function you are referring to.

A list of journal articles containing KIDs can be found on the citations page.

In some instances, you may want to refer to the journal article where the function under discussion was developed, so that readers know where to look if they want to know more about the function itself. In this situation, the relevant journal/author information can be found in the information bar of the Kymata viewing pane, and this can be referenced in your text in the normal way, e.g.

The instantaneous loudness function (Kymata ID: QRLFE; Moore, Glasberg & Baer, 1997) has long been suggested as the mechanism by which ...

Citing the Kymata Measurement Datasets

The correct citation for each dataset is given in the dataset's readme file, which can be found on the datasets page.

If you are looking for access to the raw electromagnetic measurements used to generate Kymata's pathways, please see the datasets section.

Sharing

There are two main ways to share pathways information in Kymata. The first is the ability to create permalinks to pathways generated from specific datasets. Clicking on the 'share' button below the viewer will allow you to generate these.

Warning: The ability to share the expression of a specific dataset is still in development, and users are currently only able to reference the latest dataset.

More generally, functions in the latest dataset can be linked to directly by using the addressing convention:

https://kid.kymata.org/<KID>

https://kid.kymata.org/<KID>/latest

where <KID> is the function's KID. In addition to permalinks, we make all our pathways accessible via the API.
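As a small illustration of the addressing convention, the helper below builds both link forms from a KID. kid_link() is a hypothetical name used only in this sketch; the base address is the one given above.

```python
from urllib.parse import quote

KID_BASE_URL = "https://kid.kymata.org"


def kid_link(kid, latest=False):
    """Return the link to a function's expression, given its KID."""
    if not kid:
        raise ValueError("a KID is required")
    url = f"{KID_BASE_URL}/{quote(kid, safe='')}"
    return f"{url}/latest" if latest else url


print(kid_link("UYBPJ"))
print(kid_link("UYBPJ", latest=True))
```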

API

The Application Programming Interface (API) gives users access to the processing pathway graph and supplementary information in JSON format.

Pathways data can be found at:

https://kymata.org/api/pathways

Information about each of the hypothesised functions can be found at:

https://kymata.org/api/functions

Information about one specific function can be found at:

https://kymata.org/api/functions/<KID>

where <KID> is the function's KID. The API is currently undergoing development, and more details on the schema will be available when this is complete.
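A minimal sketch of reading these endpoints from Python, using only the standard library, is given below. The helper names api_url() and fetch_json() are invented for this example, and because the schema has not been published the sketch makes no assumptions about the structure of the returned JSON.

```python
import json
from urllib.request import urlopen

API_ROOT = "https://kymata.org/api"


def api_url(resource, kid=None):
    """Return the address of 'pathways' or 'functions', optionally for one KID."""
    if resource not in ("pathways", "functions"):
        raise ValueError(f"unknown resource: {resource!r}")
    if kid is not None and resource != "functions":
        raise ValueError("a KID can only be combined with 'functions'")
    url = f"{API_ROOT}/{resource}"
    return url if kid is None else f"{url}/{kid}"


def fetch_json(url, timeout=10):
    """Download a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(api_url("pathways"))
print(api_url("functions", "UYBPJ"))
```

Calling fetch_json(api_url("functions")) would then download and decode the function list, provided the endpoint is reachable. Since the API is still under development, it is worth inspecting the decoded object before relying on any particular keys.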

Datasets

The pathways in Kymata are generated from a single set of electromagnetic measurements of the human cortex, called a dataset.

Periodically, we record a new, bigger dataset and re-generate the pathway map. Each new dataset is given an ID (1.00, 2.00, 2.01 etc.), with higher version numbers denoting more recent datasets. Because we try to reduce the noise with each dataset, the higher-numbered datasets should also generate the more accurate pathway maps. Kymata displays the map generated from the latest dataset by default, so most users will always be accessing the most accurate version of the map. However, older maps are still accessible in Kymata: for instance, when a user shares or cites a permalink, it links back to the archived version of the map, with an alert informing the user that the map they are looking at has been superseded. This ensures that when a user cites a map result in an academic setting, this (now archived) map is still available to a reader.
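When handling dataset IDs in code, it helps to compare them as numbers rather than as text. The sketch below assumes that every ID has the form major.minor with whole-number parts, as in the examples above; the ID "10.00" is hypothetical and is included only to show why plain string comparison would go wrong.

```python
def parse_dataset_id(dataset_id):
    """Split an ID such as '2.01' into a tuple of integers, here (2, 1)."""
    major, minor = dataset_id.split(".")
    return int(major), int(minor)


def latest_dataset(dataset_ids):
    """Return the ID with the highest version number."""
    return max(dataset_ids, key=parse_dataset_id)


ids = ["1.00", "2.00", "2.01", "10.00"]  # "10.00" is a made-up future ID
print(latest_dataset(ids))  # prints 10.00
print(max(ids))             # prints 2.01, because text comparison sees "2" > "1"
```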

The current map changes over time in accordance with the functions that have been fed into it as hypotheses. Thus it is always possible that between citing a current result, and the data being archived, the map may have changed. However, in most cases, these changes are likely to be marginal.

In addition to the pathways being made available for each dataset, the dataset itself (that is, the raw recordings and the stimuli) is also made available for reuse, under a Creative Commons Attribution 4.0 International License. This, together with documentation and information about how the datasets were generated, can be accessed in the datasets section of Kymata.

Data Conventions

Function meta-data

Although each function is characterised by an equation or algorithm, there are other things we record about it. The first, and most important, is the KID (Kymata ID). This is the intended method for referring to a function. However, these IDs mean very little to a new user, so the name of the function is the function's secondary (although often more common) identifier in papers. For instance, 'CIEL*A*B* lightness' is the name given to the function which characterises the dimension of 'lightness' of the average image in the CIEL*A*B* colour space. Unlike KIDs, function names may change due to language usage, so the correct KIDs should always be referenced at least once in academic papers.

Each function also has an author and a reference, that is, the individual or individuals who first published this function, and where this was published. This reference may be to a journal article, or the web address of the source code. Some functions have more than one author, or are the combination of several functions by several different authors. The author 'KGH' refers to the 'Kymata Hypothesis Group', which means that the function was generated in-house by the Kymata development team.

In many cases, authorship may not be clear or may be convoluted (a function may be the representation of an intuition given in a paper by a different author, or have many authors), and we have tried to attribute these functions correctly where possible. Functions that include other functions as subfunctions do not generally have the authors of the subfunctions included, as they are cited in the subfunctions themselves.

Other metadata includes an overview of the function, an equation and an equation explanation. These descriptions and equations are not intended to be exhaustive. Most of the functions in Kymata are very complicated, and the equation and equation explanation are often simplified for reasons of space. It is highly recommended that users consult the provided reference to familiarise themselves with the actual function.

Tags provide a set of keywords that describe a function, but again, are not exhaustive due to the complexity of most functions.
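The meta-data described in this section could be held in a record like the one sketched below. The class and field names are assumptions made for illustration and do not reflect Kymata's actual schema; the example values for the instantaneous loudness function are taken from this documentation, except for the tag, which is invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FunctionMetadata:
    kid: str                       # primary, permanent identifier
    name: str                      # secondary identifier; may change over time
    authors: List[str]
    reference: str                 # journal article or source-code address
    overview: str = ""
    equation: str = ""             # often simplified for reasons of space
    equation_explanation: str = ""
    tags: List[str] = field(default_factory=list)  # not exhaustive

    def label(self) -> str:
        """Name followed by the KID, as recommended for academic text."""
        return f"{self.name} [Kymata ID: {self.kid}]"


loudness = FunctionMetadata(
    kid="QRLFE",
    name="instantaneous loudness",
    authors=["Moore", "Glasberg", "Baer"],
    reference="Moore, Glasberg & Baer (1997)",
    tags=["loudness"],  # illustrative tag, not taken from the atlas
)
print(loudness.label())
```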

Function naming conventions

Function names are largely arbitrary. In most cases, the function name either relates to the conventional mathematical term (e.g. calculate_hilbert_envelope()) or the name given to it by the author (calculate_instantaneous_loudness()). As all functions are 'calculating' or 'transforming' an input, we often leave these two terms out, e.g. hilbert_envelope() and instantaneous_loudness().

Unfortunately, these names may be ambiguous to a new user; there are many competing loudness functions and it may be that, for reasons of clarity, the function name has to be amended, say to cambridge_instantaneous_loudness(). But this means that all publications that have referenced the function using the previous name will now be referring to a function that does not exist or, worse, whose name has been reassigned. To avoid this, we assign all functions a Kymata ID, or KID, a unique identifier which will stay with the function even if its name changes. In the case of instantaneous_loudness(), this KID is qrlfe.

It is always recommended that users use a function's KID to refer to a function. See citing for more details.
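The reason for preferring KIDs can be shown with a small sketch: if functions are looked up by KID, a rename changes only the displayed name. The registry, rename_function() and short_name() below are hypothetical and written for this example; the exact spelling of the dropped prefixes ("calculate_" and "transform_") is also an assumption.

```python
# Hypothetical registry: KID -> current function name.
functions_by_kid = {"qrlfe": "calculate_instantaneous_loudness()"}

DROPPED_PREFIXES = ("calculate_", "transform_")  # assumed spellings


def short_name(name):
    """Drop a leading 'calculate_' or 'transform_' from a function name."""
    for prefix in DROPPED_PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def rename_function(registry, kid, new_name):
    """Change the name stored against a KID; the KID itself never changes."""
    if kid not in registry:
        raise KeyError(kid)
    registry[kid] = new_name


print(short_name(functions_by_kid["qrlfe"]))
rename_function(functions_by_kid, "qrlfe", "cambridge_instantaneous_loudness()")
print(functions_by_kid["qrlfe"])  # the same KID now resolves to the new name
```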

References

1. A. Turing (1950) "Computing Machinery and Intelligence". Mind 59(236):433-460.
2. D. McCandless, P. Doughty-White, M. Quick (2014) "Million lines of code" informationisbeautiful.net
3. A. Turing (1950) "Computing Machinery and Intelligence". Mind 59(236):433-460.
4. A. Turing (1951) "Can digital computers think?: a 1951 BBC radio lecture". The Essential Turing (ed. B. J. Copeland), Clarendon Press.
5. A. Thwaites, E. Wieser, A. Soltan, I. Nimmo-Smith, I. Zulfiqar (in prep.) "Kymata, a directed graph of information processing pathways in the human cortex" TBC
6. A. Klein, J. Tourville (2012) "101 labeled brain images and a consistent human cortical labeling protocol" Frontiers in Brain Imaging Methods. 6:171. DOI: 10.3389/fnins.2012.00171
7. A. Klein, E. Neto, S. Ghosh, N. Nichols, F. Bao, J. Giard, Y. Hame, M. Reuter, J. Tourville (2016) "Mindboggle" mindboggle.info