Collocations
What are collocations?
Collocations are words that typically occur with a given search word. If two words make up a collocation, this particular combination is statistically more likely to occur than combinations of either word with other words. The word stråtækt occurs almost exclusively in combination with the word hus – it is over-represented, and therefore stråtækt is a significant collocate of hus. Dette also occurs in combination with hus, but this word is in itself much more frequent. In contrast to stråtækt, dette also collocates with a number of other words. It is therefore not a significant collocate of hus.
Collocates may reveal something about the meaning of a word. From the statistics alone it is easy to see that bevægelse can either be concrete (glidende, roterende, langsom) or designate an organisation (nyreligiøs, folkelig, demokratisk).
They may also be used to detect set phrases or to investigate the variation of a phrase: anden etnisk baggrund, på den lange bane or trådløst netværk.
You can see examples of a collocation by clicking on a collocate word in the result list. This will perform a new query for the search word in combination with the collocate.
Settings for the search word
Two settings control how the search word is matched: part of speech and inflected forms. If you choose the option Include inflected forms, you have to enter a base form in the search box, and the statistics will be computed on the basis of all inflected forms of the word. If you enter an inflected form and choose Include inflected forms, the statistics will be computed for the entered form only.
If there is more than one word with the same base form you may want to restrict the computation to just one of the words by selecting a particular part of speech, for instance have, v.
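The effect of these two settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how KorpusDK is implemented: the token list, the tags v and n, and the function matching_positions are all hypothetical, and the fallback to exact-form matching simply mirrors the behaviour described above for inflected input.

```python
# Hypothetical annotated tokens: (form, base form, part of speech).
# The data and the tag names are invented for illustration.
tokens = [
    ("har", "have", "v"),
    ("havde", "have", "v"),
    ("have", "have", "v"),
    ("haven", "have", "n"),
    ("have", "have", "n"),
    ("hus", "hus", "n"),
]

def matching_positions(tokens, query, include_inflected=True, pos=None):
    """Return the positions of tokens that count as the search word."""
    # Assumption: a query only expands to inflected forms if it is a base form.
    base_forms = {lemma for _, lemma, _ in tokens}
    use_lemma = include_inflected and query in base_forms
    hits = []
    for i, (form, lemma, tag) in enumerate(tokens):
        # An optional part-of-speech restriction, e.g. "have, v".
        if pos is not None and tag != pos:
            continue
        if (lemma if use_lemma else form) == query:
            hits.append(i)
    return hits

all_forms = matching_positions(tokens, "have")            # verb and noun forms
verb_only = matching_positions(tokens, "have", pos="v")   # verb forms only
```

With the part-of-speech restriction, the noun forms drop out of the computation, so their neighbours no longer contribute to the statistics.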
Corpora
Only the corpora Korpus 90 and Korpus 2000 are available for collocations – it is not possible to compute collocations for the entire KorpusDK.
Korpus 90 is the default choice because Korpus 2000 is slightly biased, favouring newspaper texts, whereas Korpus 90 is more balanced and therefore gives a more reliable picture of the language.
Statistical computation
There are various ways of calculating collocations. At KorpusDK the default method is Mutual Information, but you can change it if you wish: go to the search result page and pick another option in the box Statistical functions. Read more about statistical functions.
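To give an impression of what such a score measures, here is a minimal sketch of pointwise mutual information in its common textbook form. The exact formula used by KorpusDK is not stated here, so treat this as an assumption. The function name pmi and all counts are invented for illustration.

```python
import math

def pmi(pair_count, word_count, collocate_count, total_tokens):
    """Pointwise mutual information (log base 2) of a word pair.

    Compares how often the pair is observed with how often it would be
    expected if the two words occurred independently of each other.
    """
    p_pair = pair_count / total_tokens
    p_word = word_count / total_tokens
    p_collocate = collocate_count / total_tokens
    return math.log2(p_pair / (p_word * p_collocate))

# Invented counts: both collocates occur 8 times next to the search word,
# but one is rare in the corpus and the other is very frequent.
TOTAL = 1_000_000
SEARCH_WORD = 1_000
rare = pmi(8, SEARCH_WORD, 10, TOTAL)         # rare collocate
frequent = pmi(8, SEARCH_WORD, 8_000, TOTAL)  # frequent collocate

print(f"rare collocate:     {rare:.2f}")
print(f"frequent collocate: {frequent:.2f}")
```

The rare collocate gets a high score because nearly all of its occurrences are next to the search word. The frequent one scores about zero because it occurs next to the search word no more often than chance would predict. This is the same contrast as between stråtækt and dette in the introduction.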
Viewing the result
The search result is displayed as a list of words that typically co-occur with the search word. The words are ranked according to their score, with the most significant word at the top. By default there is one list of collocates, but you can also choose a view where the collocates are grouped according to part of speech.
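The two views can be illustrated with a short Python sketch. The scores, the part-of-speech tags and the function names rank and group_by_pos are hypothetical. The sketch shows only the ranking and grouping logic, not the actual KorpusDK output.

```python
from collections import defaultdict

# Hypothetical scored collocates: (word, part of speech, score).
collocates = [
    ("glidende", "adj", 7.2),
    ("folkelig", "adj", 8.1),
    ("sætte", "v", 5.4),
    ("retning", "n", 6.0),
]

def rank(items):
    """Single-list view: highest score first."""
    return sorted(items, key=lambda item: item[2], reverse=True)

def group_by_pos(items):
    """Grouped view: one ranked list per part of speech."""
    groups = defaultdict(list)
    for word, pos, score in rank(items):
        groups[pos].append((word, score))
    return dict(groups)

for word, pos, score in rank(collocates):
    print(f"{word:10} {pos:4} {score:.1f}")
```

Because the grouped view is built from the ranked list, the order within each part of speech follows the same scores as the single list.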

