Bibliography

Project publications

2021

 1. Dobrovoljc, Kaja (2021, in preparation). Leksikon formulaičnih besednih nizov v pisni in govorjeni slovenščini.
 2. Gantar, Polona (2021, in print). Zapis frazeoloških enot v Leksikonu večbesednih enot za slovenščino.
 3. Krek, Simon; Gantar, Polona (2021, in print). Analiza vezljivostnih vzorcev v sodobni standardni slovenščini.
 4. Krek, Simon; Gantar, Polona (2021, in print). Mehanizem za luščenje in prepoznavanje VLE v korpusu.
 5. Krek, Simon; Gantar, Polona; Kosem, Iztok; Dobrovoljc, Kaja; Laskowski, Cyprian; Krsnik, Luka; Brank, Janez; Arhar Holdt, Špela; Čibej, Jaka; Robnik Šikonja, Marko; Klemenc, Bojan; Gorjanc, Vojko (2021). Multiword Expressions lexicon extracted from the Gigafida 2.1 corpus, Slovenian language resource repository CLARIN.SI.
 6. Krek, Simon; Gantar, Polona; Krsnik, Luka; Laskowski, Cyprian; Dobrovoljc, Kaja; Arhar Holdt, Špela; Čibej, Jaka; Kosem, Iztok; Klemenc, Bojan; Robnik Šikonja, Marko; Gorjanc, Vojko (2021). Valency lexicon extracted from the Gigafida 2.1 corpus, Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/1418.
 7. Krek, Simon; Kosem, Iztok; Gantar, Polona (2021, in preparation). Opis modela za pridobivanje in strukturiranje kolokacijskih podatkov iz korpusa.
 8. Krek, Simon; Kosem, Iztok; Gantar, Polona; Arhar Holdt, Špela; Robnik Šikonja, Marko; Klemenc, Bojan; Dobrovoljc, Kaja; Čibej, Jaka; Laskowski, Cyprian; Krsnik, Luka; Gorjanc, Vojko (2021). Frequency lists of collocations from the Gigafida 2.1 corpus, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1415.
 9. Škvorc, Tadej; Gantar, Polona; Robnik-Šikonja, Marko (2021, in print). Strojno prepoznavanje idiomov z globokimi nevronskimi mrežami.

2020

 1. Arhar Holdt, Špela; Čibej, Jaka; Laskowski, Cyprian; Krek, Simon (2020). Morphological patterns from the Sloleks 2.0 lexicon 1.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1411.
 2. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). Frequency lists of word-level n-grams from the GOS 1.0 corpus 1.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1365.
 3. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). A Guide to Frequency Lists from the Gigafida 2.0 and GOS 1.0 Corpora. Ljubljana: Ljubljana University Press, Faculty of Arts. https://doi.org/10.4312/9789610604006.
 4. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). Frequency lists of character-level n-grams from the GOS 1.0 corpus 1.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1363.
 5. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). Consonant-vowel structures in the GOS 1.0 corpus 1.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1367.
 6. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). Frequency lists of word parts from the GOS 1.0 corpus 1.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1366.
 7. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). Frequency lists of words from the GOS 1.0 corpus 1.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1364.
 8. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2020). Consonant-vowel structures in the Gigafida 2.0 corpus, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1289.
 9. Čibej, Jaka; Arhar Holdt, Špela; Krek, Simon (2020). List of word relations from the Sloleks 2.0 lexicon 1.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1386.
 10. Dobrovoljc, Kaja (2020). ”Identifying dictionary-relevant formulaic sequences in written and spoken corpora”. International Journal of Lexicography, vol. 33, issue 4, pg. 417–442. https://doi.org/10.1093/ijl/ecaa008.
 11. Dobrovoljc, Kaja; Roblek, Rebeka; Vianello, Chiara; Diaci, Ajda; Vuga, Zala (2020). List of formulaic sequences in spoken Slovenian, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1279.
 12. Dobrovoljc, Kaja; Roblek, Rebeka; Vianello, Chiara; Diaci, Ajda; Vuga, Zala (2020). List of formulaic sequences in standard written Slovenian, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1280.
 13. Kosem, Iztok; Krek, Simon; Gantar, Polona (2020). ”Defining collocation for Slovenian lexical resources”. Slovenščina 2.0, vol. 8, issue 2, pg. 1–27. https://revije.ff.uni-lj.si/slovenscina2/article/view/9338.
 14. Krek, Simon; Arhar Holdt, Špela; Erjavec, Tomaž; Čibej, Jaka; Repar, Andraž; Gantar, Polona; Ljubešić, Nikola; Kosem, Iztok; Dobrovoljc, Kaja (2020). ”Gigafida 2.0: The Reference Corpus of Written Standard Slovene”. IN: Nicoletta Calzolari (ed.): LREC 2020: Twelfth International Conference on Language Resources and Evaluation: conference proceedings, pg. 3340–3345. Paris: ELRA – European Language Resources Association. https://www.aclweb.org/anthology/2020.lrec-1.409.
 15. Krek, Simon; Erjavec, Tomaž; Dobrovoljc, Kaja; Gantar, Polona; Arhar Holdt, Špela; Čibej, Jaka; Brank, Janez (2020). ”The ssj500k training corpus for Slovene language processing”. IN: Darja Fišer, Tomaž Erjavec (ed.): Language Technologies & Digital Humanities: conference proceedings, pg. 24–33. Ljubljana: Institute of Contemporary History. http://nl.ijs.si/jtdh20/pdf/JT-DH_2020_Krek-et-al_The-ssj500k-Training-Corpus-for-Slovene-Language-Processing.pdf.
 16. Škvorc, Tadej; Gantar, Polona; Robnik-Šikonja, Marko (2020). Dataset of Slovene idiomatic expressions SloIE, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1335.

2019

 1. Bon, Mija; Gantar, Polona (2019). ”Levels of annotation in the Slovene Training Corpus ssj500k 2.2”. Jazykovedný časopis, 10th International Conference NLP, Corpus Linguistics, Language Dynamics and Change, Bratislava, Slovakia, vol. 70, issue 2, pg. 390–399. https://doi.org/10.2478/jazcas-2019-0068.
 2. Brank, Janez (2019). Q-CAT Corpus Annotation Tool 1.1, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1282.
 3. Čibej, Jaka (2019). ”LIST – orodje za kvantitativne slovnične raziskave korpusov”. Lecture at Nova slovnica sodobne standardne slovenščine: viri in metode, Ljubljana. https://videolectures.net/novaSlovnicaLjubljana_cibej_raziskave_korpusov/.
 4. Čibej, Jaka (2019). LIST: Orodje za kvantitativno analizo korpusov: priročnik za uporabo. https://slovnica.ijs.si/wp-content/uploads/2019/11/LIST_prirocnik_1.0.pdf.
 5. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2019). Frequency lists of character-level n-grams from the Gigafida 2.0 corpus, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1272.
 6. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2019). Frequency lists of words from the Gigafida 2.0 corpus, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1273.
 7. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2019). Frequency lists of word-level n-grams from the Gigafida 2.0 corpus, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1274.
 8. Čibej, Jaka; Arhar Holdt, Špela; Dobrovoljc, Kaja; Krek, Simon (2019). Frequency lists of word parts from the Gigafida 2.0 corpus, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1275.
 9. Dobrovoljc, Kaja (2019). ”Annotating formulaic sequences in spoken Slovenian: structure, function and relevance”. IN: Annemarie Friedrich, Deniz Zeyrek, Jet Hoek (ed.): LAW XIII, The 13th Linguistic Annotation Workshop, conference proceedings, pg. 108–112. Firenze, Italy. Stroudsburg: The Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-4013/.
 10. Dobrovoljc, Kaja (2019). Q-CAT: Orodje za ročno označevanje in analizo besedilnih korpusov: priročnik za uporabo. https://slovnica.ijs.si/wp-content/uploads/2019/10/Q-CAT_prirocnik.pdf.
 11. Dobrovoljc, Kaja (2019). ”Slovnične analize ročno označenega korpusa ssj500k z orodjem Q-CAT”. Lecture at Nova slovnica sodobne standardne slovenščine: viri in metode, Ljubljana. https://videolectures.net/novaSlovnicaLjubljana_dobrovoljc_slovnicne_analize/.
 12. Dobrovoljc, Kaja; Erjavec, Tomaž; Ljubešić, Nikola (2019). ”Improving UD processing via satellite resources for morphology”. IN: UDW 2019, Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019), conference proceedings, pg. 24–34. Paris, France. Stroudsburg: Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-8004/.
 13. Dobrovoljc, Kaja; Martinc, Matej (2019). ”Er ... well, it matters, right? On the role of data representations in spoken language dependency parsing”. IN: Marie-Catherine de Marneffe, Teresa Lynn, Sebastian Schuster (ed.): Second Workshop on Universal Dependencies (UDW 2018), conference proceedings, pg. 37–46. Brussels. Stroudsburg: Association for Computational Linguistics. https://www.aclweb.org/anthology/W18-6005/.
 14. Gantar, Polona; Arhar Holdt, Špela; Čibej, Jaka; Kuzman, Taja (2019). ”Structural and Semantic Classification of Verbal Multi-Word Expressions in Slovene”. Contributions to Contemporary History, vol. 59, issue 1, pg. 99–119. https://ojs.inz.si/pnz/article/view/325.
 15. Gantar, Polona; Čibej, Jaka; Bon, Mija (2019). ”Slovene multi-word units: identification, categorization, and representation”. IN: Gloria Corpas Pastor, Ruslan Mitkov (ed.): Computational and corpus-based phraseology, conference proceedings, pg. 99–112. Cham: Springer. https://link.springer.com/chapter/10.1007%2F978-3-030-30135-4_8.
 16. Kosem, Iztok; Gantar, Polona; Krek, Simon; Arhar Holdt, Špela; Čibej, Jaka; Laskowski, Cyprian; Pori, Eva; Klemenc, Bojan; Dobrovoljc, Kaja; Gorjanc, Vojko; Ljubešić, Nikola (2019). Collocations Dictionary of Modern Slovene KSSS 1.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1250.
 17. Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž; Može, Sara; Ledinek, Nina; Holz, Nanika; Zupan, Katja; Gantar, Polona; Kuzman, Taja; Čibej, Jaka; Arhar Holdt, Špela; Kavčič, Teja; Škrjanec, Iza; Marko, Dafne; Jezeršek, Lucija; Zajc, Anja (2019). Training corpus ssj500k 2.2, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1210.
 18. Krsnik, Luka; Arhar Holdt, Špela; Čibej, Jaka; Dobrovoljc, Kaja; Ključevšek, Aleksander; Krek, Simon; Robnik-Šikonja, Marko (2019). Corpus extraction tool LIST 1.2, Slovenian language resource repository CLARIN.SI, https://www.clarin.si/repository/xmlui/handle/11356/1276.
 19. Ljubešić, Nikola; Dobrovoljc, Kaja (2019). ”What does neural bring? Analysing improvements in morphosyntactic annotation and lemmatisation of Slovenian, Croatian and Serbian”. IN: Tomaž Erjavec et al. (ed.): 7th Workshop on Balto-Slavic Natural Language Processing, conference proceedings, pg. 29–34. Firenze, Italy. Stroudsburg: The Association for Computational Linguistics. https://www.aclweb.org/anthology/W19-3704/.
 20. Škvorc, Tadej; Krek, Simon; Pollak, Senja; Arhar Holdt, Špela; Robnik Šikonja, Marko (2019). ”Predicting Slovene text complexity using readability measures”. Contributions to Contemporary History, vol. 59, issue 1, pg. 198–220. https://ojs.inz.si/pnz/article/download/323/605.
 21. Škvorc, Tadej; Robnik-Šikonja, Marko (2019). ”Prepoznavanje idiomatskih besednih zvez z uporabo besednih vložitev”. Uporabna Informatika, vol. 27, issue 3. https://uporabna-informatika.si/index.php/ui/article/view/63.

2018

 1. Arhar Holdt, Špela; Čibej, Jaka (2018). ”Oblikoslovni vzorci v leksikonu Sloleks: izhodiščni nabor za samostalnike”. Slovnične raziskave za jezikovni opis. Slovenščina 2.0, Thematic issue, vol. 6, issue 2, pg. 33–66. https://www.dlib.si/details/URN:NBN:SI:DOC-C6R9113Q.
 2. Dobrovoljc, Kaja (2018). ”Formulaičnost v slovenskem jeziku”. Slovnične raziskave za jezikovni opis. Slovenščina 2.0, Thematic issue, vol. 6, issue 2, pg. 67–95. http://www.dlib.si/?URN=URN:NBN:SI:DOC-IYNQSMXC.
 3. Dobrovoljc, Kaja (2018). ”N-gram Frequency Lists for Reference Corpora of Slovenian Language”. IN: Darja Fišer, Andrej Pančur (ed.): Language Technologies & Digital Humanities: conference proceedings, pg. 47–54. Ljubljana: Ljubljana University Press, Faculty of Arts. http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Dobrovoljc-K_Frekvencni-seznami-n-gramov-v-korpusih-slovenskega-jezika.pdf.
 4. Dobrovoljc, Kaja (2018). Gos corpus n-grams 2.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1195.
 5. Dobrovoljc, Kaja (2018). IMP corpus n-grams 2.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1194.
 6. Dobrovoljc, Kaja (2018). Kres corpus n-grams 2.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1193.
 7. Dobrovoljc, Kaja (2018). Janes corpus n-grams 1.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1192.
 8. Gantar, Polona; Arhar Holdt, Špela; Čibej, Jaka; Kuzman, Taja; Kavčič, Teja (2018). ”Verbal multi-word expressions in the Slovene training corpus ssj500k 2.1”. IN: Darja Fišer, Andrej Pančur (ed.): Language Technologies & Digital Humanities: conference proceedings, pg. 85–92. Ljubljana: Ljubljana University Press, Faculty of Arts. http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Gantar-et-al_Glagolske-vecbesedne-enote-v-ucnem-korpusu-ssj500k-2-1.pdf.
 9. Gantar, Polona; Arhar Holdt, Špela; Pollak, Senja (2018). ”Lexical novelties in computer-mediated communication”. Slavistična revija: časopis za jezikoslovje in literarne vede, vol. 66, issue 4, pg. 459–472. https://srl.si/ojs/srl/article/view/2018-4-1-4.
 10. Gantar, Polona; Štrkalj Despot, Kristina; Krek, Simon; Ljubešić, Nikola (2018). ”Towards semantic role labeling in Slovene and Croatian”. IN: Darja Fišer, Andrej Pančur (ed.): Language Technologies & Digital Humanities: conference proceedings, pg. 93–98. Ljubljana: Ljubljana University Press, Faculty of Arts. http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Gantar-et-al_Towards-Semantic-Role-Labeling-in-Slovene-and-Croatian.pdf.
 11. Kosem, Iztok; Krek, Simon; Gantar, Polona; Arhar Holdt, Špela; Čibej, Jaka; Laskowski, Cyprian Adam (2018). ”Collocations dictionary of modern Slovene”. IN: Jaka Čibej, Vojko Gorjanc, Iztok Kosem, Simon Krek (ed.): 18th EURALEX International Congress: Lexicography in Global Contexts: conference proceedings, pg. 989–997. Ljubljana: Ljubljana University Press, Faculty of Arts. https://euralex.org/wp-content/themes/euralex/proceedings/Euralex%202018/118-4-2939-1-10-20180820.pdf.
 12. Kosem, Iztok; Krek, Simon; Gantar, Polona; Arhar Holdt, Špela; Čibej, Jaka; Laskowski, Cyprian Adam (2018). ”Collocations dictionary of modern Slovene”. IN: Darja Fišer, Andrej Pančur (ed.): Language Technologies & Digital Humanities: conference proceedings, pg. 133–139. Ljubljana: Ljubljana University Press, Faculty of Arts. http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Kosem-et-al_Kolokacijski-slovar-sodobne-slovenscine.pdf.
 13. Ljubešić, Nikola (2018). Meta-tagger, programming code on GitHub, Slovenian language resource repository CLARIN.SI, https://github.com/clarinsi/meta-tagger.
 14. Rozman, Tadeja; Arhar Holdt, Špela; Pollak, Senja; Kosem, Iztok (2018). ”Kolokacije v korpusu Šolar”. Jezik in slovstvo, vol. 63, issue 2/3, pg. 117–128. https://www.jezikinslovstvo.com/pdf.php?part=2018%7C2-3%7C117-128.
 15. Škvorc, Tadej; Krek, Simon; Pollak, Senja; Arhar Holdt, Špela; Robnik Šikonja, Marko (2018). ”Evaluation of statistical readability measures on Slovene texts”. IN: Darja Fišer, Andrej Pančur (ed.): Language Technologies & Digital Humanities, conference proceedings, pg. 240–247. http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Skvorc-et-al_Evaluation-of-Statistical-Readability-Measures-on-Slovene-texts.pdf.

2017

 1. Gantar, Polona; Krek, Simon; Kuzman, Taja (2017). ”Verbal multiword expressions in Slovene”. IN: Ruslan Mitkov (ed.): Computational and corpus-based phraseology: conference proceedings, pg. 247–259. Cham: Springer. https://link.springer.com/chapter/10.1007/978-3-319-69805-2_18.

Source project literature

2021

 1. Dobrovoljc, Kaja (in preparation). Leksikalne prvine govorjenega jezika v uporabniških spletnih vsebinah: primer večbesednih diskurznih označevalcev. Doctoral dissertation.

2016

 1. Arhar Holdt, Špela; Dobrovoljc, Kaja (2016). »Vrednost korpusa Janes za slovensko normativistiko«. Slovenščina 2.0, vol. 4, issue 2, pg. 1–37. Ljubljana: Trojina, zavod za uporabno slovenistiko. https://www.dlib.si/details/URN:NBN:SI:DOC-2NJ0THO7.
 2. Krek, Simon; Gantar, Polona; Kosem, Iztok; Gorjanc, Vojko; Laskowski, Cyprian (2016). »Baza kolokacijskega slovarja slovenskega jezika«. IN: Tomaž Erjavec, Darja Fišer (ed.): Language Technologies & Digital Humanities 2016: conference proceedings, pg. 101–105. Ljubljana: Ljubljana University Press, Faculty of Arts. http://www.sdjt.si/wp/wp-content/uploads/2016/09/JTDH-2016_Krek-et-al_Baza-kolokacijskega-slovarja-slovenskega-jezika.pdf.
 3. Logar, Nataša; Arhar Holdt, Špela; Erjavec, Tomaž (2016). »Slovenski strokovni jezik: korpusni opis trpnika«. IN: Erika Kržišnik, Miran Hladnik (ed.): Toporišičeva obdobja 35: symposium, pg. 237–245. https://centerslo.si/wp-content/uploads/2016/11/LogarArhHolErj.pdf.

2015

 1. Gantar, Polona (2015). Leksikografski opis slovenščine v digitalnem okolju. Ljubljana: Ljubljana University Press, Faculty of Arts. https://www.dlib.si/details/URN:NBN:SI:DOC-C6OT60O0.
 2. Gog, Simon; Moffat, Alistair; Petri, Matthias (2015). »On identifying phrases using collection statistics«. IN: Allan Hanbury, Gabriella Kazai, Andreas Rauber, Norbert Fuhr (ed.): Advances in Information Retrieval, ECIR 2015: conference proceedings, pg. 278–283. Cham: Springer. https://doi.org/10.1007/978-3-319-16354-3_30.
 3. Lai, Siwei; Xu, Liheng; Liu, Kang; Zhao, Jun (2015). »Recurrent convolutional neural networks for text classification«. 29th American Association for Artificial Intelligence conference on Artificial Intelligence: conference proceedings, pg. 2267–2273. Austin, Texas: AAAI Press. https://dl.acm.org/doi/10.5555/2886521.2886636.
 4. Ramisch, Carlos (2015). Multiword Expressions Acquisition: A Generic and Open Framework. Cham: Springer. https://link.springer.com/book/10.1007/978-3-319-09207-2.
 5. Rao, Kanishka; Peng, Fuchun; Sak, Haşim; Beaufays, Françoise (2015). »Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks«. IEEE International Conference on Acoustics, Speech and Signal Processing: conference proceedings, pg. 4225–4229. https://ieeexplore.ieee.org/document/7178767.
 6. Verdonik, Darinka (2015). »Internal variety in the use of Slovene general extenders in different spoken discourse settings«. International Journal of Corpus Linguistics, vol. 20, issue 4, pg. 445–468. https://doi.org/10.1075/ijcl.20.4.02ver.
 7. Zhang, Xiang; Zhao, Junbo; LeCun, Yann (2015). »Character-level convolutional networks for text classification«. Advances in Neural Information Processing Systems, pg. 649–657. https://arxiv.org/abs/1509.01626.

2014

 1. Ahlin, Martin et al. (2014). Slovar slovenskega knjižnega jezika. Ljubljana: Mladinska knjiga.
 2. Kilgarriff, Adam et al. (2014). »The Sketch Engine: ten years on«. Lexicography, pg. 1–30. https://link.springer.com/article/10.1007/s40607-014-0009-9.
 3. Krek, Simon (2014). »Prva in druga izdaja SSKJ«. Slovenščina 2.0, vol. 2, issue 2, pg. 114–160. https://www.dlib.si/details/URN:NBN:SI:doc-IH81QTWY.
 4. Stramljič Breznik, Irena (2014). Medmeti v slovenskem jeziku. Maribor: Založba Pivec.

2013

 1. Arhar Holdt, Špela (2013). »Študentje, škratje in nadškofje: končnica -je v imenovalniku množine pri samostalnikih prve moške sklanjatve«. Slovenščina 2.0, vol. 1, issue 1, pg. 134–154. Ljubljana: Trojina, zavod za uporabno slovenistiko. http://www.dlib.si/details/URN:NBN:SI:doc-XX582TCH.
 2. Kosem, Iztok; Gantar, Polona; Krek, Simon (2013). »Automation of lexicographic work: an opportunity for both lexicographers and crowd-sourcing«. IN: Iztok Kosem, Jelena Kallas, Polona Gantar, Simon Krek, Margit Langemets, Maria Tuulik (ed.): Electronic lexicography in the 21st century: thinking outside the paper: conference proceedings, pg. 32–48. Ljubljana: Trojina, Institute for Applied Slovene Studies; Tallinn: Eesti Keele Instituut. https://www.dlib.si/details/URN:NBN:SI:DOC-HSTQ0XWM.
 3. Može, Sara (2013). FrameNet in večjezičnost: kontrastivna analiza glagolov premikanja v slovenščini in angleščini. Doctoral dissertation. University of Ljubljana.

2012

 1. Gantar, Polona (2012). »Slovnični in pomenski opisi v leksikalni bazi za slovenščino«. IN: Franc Marušič, Rok Žaucer (ed.): Škrabčevi dnevi 7: symposium proceedings, pg. 17–27. Nova Gorica: University of Nova Gorica. http://www.ung.si/~fmarusic/pub/marusic&zaucer_2012_skrabec_7.
 2. Gries, Stefan Th. (2012). »Frequencies, probabilities, and association measures in usage-/exemplar-based linguistics: some necessary clarification«. Studies in Language, vol. 36, issue 3, pg. 477–510. https://doi.org/10.1075/sl.36.3.02gri.
 3. Dobrovoljc, Kaja; Krek, Simon; Rupnik, Jan (2012). »Skladenjski razčlenjevalnik za slovenščino«. IN: Tomaž Erjavec, Jerneja Žganec Gros (ed.): Eighth conference Language Technologies, conference proceedings, pg. 35–40. Ljubljana: Jožef Stefan Institute. http://nl.ijs.si/isjt12/proceedings/isjt2012_08.pdf.
 4. Logar Berginc, Nataša; Grčar, Miha; Brakus, Marko; Erjavec, Tomaž; Arhar Holdt, Špela; Krek, Simon (2012). Korpusi slovenskega jezika Gigafida, KRES, ccGigafida in ccKRES: gradnja, vsebina, uporaba. Ljubljana: Trojina, zavod za uporabno slovenistiko. Digital edition publisher: Ljubljana University Press, Faculty of Arts (2020). https://doi.org/10.4312/9789610603542.
 5. Mendes, Pablo; Daiber, Joachim; Rajapakse, Rohana; Sasaki, Felix; Bizer, Christian (2012). »Evaluating the Impact of Phrase Recognition on Concept Tagging«. IN: Nicoletta Calzolari et al. (ed.): Eighth International Conference on Language Resources and Evaluation (LREC'12): conference proceedings, pg. 1277–1280. https://www.aclweb.org/anthology/L12-1307/.

2011

 1. Collobert, Ronan; Weston, Jason; Bottou, Léon; Karlen, Michael; Kavukcuoglu, Koray; Kuksa, Pavel (2011). »Natural language processing (almost) from scratch«. Journal of Machine Learning Research, vol. 12, issue 76, pg. 2493–2537. https://www.jmlr.org/papers/v12/collobert11a.html.
 2. Verdonik, Darinka; Zwitter Vitez, Ana (2011). Slovenski govorni korpus Gos. Ljubljana: Trojina, zavod za uporabno slovenistiko. Digital edition publisher: Ljubljana University Press, Faculty of Arts (2020). https://doi.org/10.4312/9789610603528.

2010

 1. Baldwin, Timothy; Kim, Su Nam (2010). »Multiword expressions«. IN: Nitin Indurkhya, Fred J. Damerau (ed.): Handbook of Natural Language Processing, Second Edition, pg. 267–292. Boca Raton, Florida: CRC Press. https://people.eng.unimelb.edu.au/tbaldwin/pubs/handbook2009.pdf.
 2. Bybee, Joan (2010). Language, Usage and Cognition. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511750526.
 3. Davies, Mark; Gardner, Dee (2010). A Frequency Dictionary of American English: Word Sketches, Collocates, and Thematic Lists. Abingdon: Routledge.
 4. Pecina, Pavel (2010). »Lexical association measures and collocation extraction«. IN: Nicoletta Calzolari, Nancy Ide (ed.): Language Resources and Evaluation, issue 44, pg. 137–158. Praha: Institute of Formal and Applied Linguistics. https://link.springer.com/article/10.1007/s10579-009-9101-4.
 5. Ramisch, Carlos; Villavicencio, Aline; Boitet, Christian (2010). »Mwetoolkit: a Framework for Multiword Expression Identification«. IN: Nicoletta Calzolari et al. (ed.): Seventh International Conference on Language Resources and Evaluation LREC'10: conference proceedings, pg. 662–669. Valletta: European Language Resources Association (ELRA). https://www.aclweb.org/anthology/L10-1553/.

2009

 1. Biber, Douglas (2009). »A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing«. International Journal of Corpus Linguistics, vol. 14, issue 3, pg. 275–311. https://doi.org/10.1075/ijcl.14.3.08bib.

2008

 1. Erjavec, Tomaž; Krek, Simon (2008). »The JOS morphosyntactically tagged corpus of Slovene«. 6th International Conference on Language Resources and Evaluation LREC'08: conference proceedings, pg. 322–326. Marrakech: European Language Resources Association (ELRA). https://www.aclweb.org/anthology/L08-1451/.
 2. Gantar, Polona (2008). »(Slovenska) leksika med leksikonom in slovnico«. Jezik in slovstvo, vol. 53, issue 5, pg. 19–35. Ljubljana: Slavistično društvo Slovenije. https://www.dlib.si/details/URN:NBN:SI:doc-BSE6C2VP.
 3. Hanks, Patrick (2008). Mapping meaning onto use: a Pattern Dictionary of English Verbs. Utah: AACL.
 4. Žele, Andreja (2008). Vezljivostni slovar slovenskih glagolov. Ljubljana: Založba ZRC, ZRC SAZU. https://zalozba.zrc-sazu.si/p/820.

2007

 1. Čermák, František (2007). Frekvenční slovník mluvené češtiny. Praha: Karolinum. https://karolinum.cz/en/books/cermak-frekvencni-slovnik-mluvene-cestiny-2743.
 2. Gantar, Polona (2007). Stalne besedne zveze v slovenščini: korpusni pristop. Ljubljana: Založba ZRC, ZRC SAZU. https://doi.org/10.3986/9789612540364.

2006

 1. Jakop, Nataša (2006). Pragmatična frazeologija. Ljubljana: Založba ZRC, ZRC SAZU. https://doi.org/10.3986/9616568493.
 2. Krek, Simon; Kilgarriff, Adam (2006). »Slovene word sketches«. IN: Tomaž Erjavec, Jerneja Žganec Gros (ed.): Language Technologies IS-LTC 2006: conference proceedings, pg. 62–67. Ljubljana: Jožef Stefan Institute. https://doi.org/10.4312/9789610601111.

2005

 1. Hanks, Patrick; Pustejovsky, James (2005). »A Pattern Dictionary for Natural Language Processing«. Revue Française de Linguistique Appliquée, vol. 10, issue 2, pg. 63–82. https://www.cairn.info/revue-francaise-de-linguistique-appliquee-2005-2-page-63.htm.
 2. Hoey, Michael (2005). Lexical Priming: A new theory of Words and Language. London: Routledge.
 3. Wray, Alison (2005). Formulaic Language and the Lexicon. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511519772.

2004

 1. Biber, Douglas; Conrad, Susan; Cortes, Viviana (2004). »If you look at ...: Lexical Bundles in University Teaching and Textbooks«. Applied Linguistics, vol. 25, issue 3, pg. 371–405. https://doi.org/10.1093/applin/25.3.371.
 2. Čermák, František; Křen, Michal (2004). Frekvenční slovník češtiny. Praha: Nakladatelství Lidové noviny.
 3. Hanks, Patrick (2004). »Corpus Pattern Analysis«. IN: Geoffrey Williams, Sandra Vessier (ed.): EURALEX 2004: conference proceedings, pg. 87–97. Lorient: Université de Bretagne‐Sud. https://euralex.org/publications/corpus-pattern-analysis/.
 4. Kilgarriff, Adam; Rychlý, Pavel; Smrz, Pavel; Tugwell, David (2004). »The Sketch Engine«. IN: Geoffrey Williams, Sandra Vessier (ed.): EURALEX 2004: conference proceedings, pg. 105–116. Lorient: Université de Bretagne‐Sud. https://euralex.org/publications/the-sketch-engine/.
 5. Nivre, Joakim; Nilsson, Jens (2004). »Multiword Units in Syntactic Parsing«. IN: Gaël Dias et al. (ed.): MEMURA 2004, Workshop at LREC 2004: conference proceedings, pg. 39–46. Paris: ELRA. http://www.lrec-conf.org/proceedings/lrec2004/ws/ws6.pdf.
 6. Schmitt, Norbert (2004). Formulaic sequences: Acquisition, processing, and use. Amsterdam: John Benjamins Publishing. https://doi.org/10.1075/lllt.9.
 7. Toporišič, Jože (2004). Slovenska slovnica. Maribor: Založba Obzorja.

2003

 1. Fillmore, Charles J.; Johnson, Christopher R.; Petruck, Miriam R. L. (2003). »Background to Framenet«. International Journal of Lexicography, vol. 16, issue 3, pg. 235–250. https://doi.org/10.1093/ijl/16.3.235.

2002

 1. Sag, Ivan A.; Baldwin, Timothy; Bond, Francis; Copestake, Ann; Flickinger, Dan (2002). »Multiword Expressions: A Pain in the Neck for NLP«. IN: Alexander Gelbukh (ed.): Computational Linguistics and Intelligent Text Processing. CICLing 2002: conference proceedings, pg. 1–15. Berlin: Springer. https://doi.org/10.1007/3-540-45715-1_1.

2001

 1. Toporišič, Jože et al. (2001). Slovenski pravopis. Ljubljana: ZRC SAZU.

1999

 1. Biber, Douglas et al. (1999). Longman Grammar of Spoken and Written English. Harlow: Pearson Education.
 2. Moguš, Milan et al. (1999). Hrvatski čestotni rječnik. Zagreb: Zavod za lingvistiku Filozofskog fakulteta.

1998

 1. Mel’čuk, Igor (1998). »Collocations and Lexical Functions«. IN: A.P. Cowie (ed.): Phraseology. Theory, Analysis, and Applications, pg. 23–53. Oxford: Clarendon Press.

1991

 1. Sinclair, John (1991). Corpus, Concordance, Collocation. Oxford: Oxford University Press.

1987

 1. Sinclair, John (1987). Collins COBUILD English Language Dictionary. Glasgow: Collins.

1985

 1. Quirk, Randolph et al. (1985). A Comprehensive Grammar of the English Language. London: Longman.

1967

 1. Kučera, Henry; Francis, Winthrop Nelson (1967). Computational analysis of present-day American English. Providence: Brown University Press.

1957

 1. Firth, John Rupert (1957). Modes of Meaning. Papers in Linguistics 1934–1951. London: Oxford University Press.