My work at TSU
Educational and research activity at Tomsk State University
Tomsk State University has been my main place of work for more than 15 years. I am very happy to be in touch with the students: their community is fantastically positive and helps me stay up to date. My main courses are the following:
- Natural language processing: introduces students to a wide range of topics in computational linguistics, including text preprocessing, tokenization, stemming, lemmatization, word embeddings and a set of popular modern models and architectures (Word2vec, Seq2seq, Attention, Transformer, BERT, GPT, Stable Diffusion and so on). Students learn how to use various NLP tools and libraries, such as NLTK and HuggingFace Transformers, as well as general machine learning frameworks, like PyTorch.
- The UNIX operating system: designed to give students a deep understanding of one of the most reliable and widely used operating systems in the world. Students learn about the history and development of UNIX, its architecture and design principles, as well as its key features and capabilities. They gain practical experience with UNIX commands and tools, and learn how to use them to manage files, processes, and networks.
- Analysis of mathematical and social networks: provides an in-depth understanding of the principles and methods for analyzing complex networks, both mathematical and social. The course is designed to equip students with the skills to model and analyze network structures, as well as to interpret the obtained results. It covers a wide range of topics, including graph theory, network metrics, centrality measures and community detection algorithms. Students learn how to use network analysis tools, such as NetworkX, and how to apply advanced analytical techniques to uncover patterns and insights in network data.
- Software Engineering: provides an overview of the fundamental principles and concepts in software development. It introduces students to the key aspects of software development, including design, implementation, testing, and maintenance. The course covers a wide range of topics, from the basics of programming to advanced techniques in project management and quality assurance. Students learn about the different stages of the software development lifecycle, as well as the tools and methodologies used in the industry.
At the university, I support several areas of research and development, including natural language processing and assistive technologies for the blind. I have become very interested in the possibility of using multi-head attention to solve applied problems for the Russian language. Among other things, I am interested in morphological tagging of abbreviations in text, i.e. cases where the annotated word is not fully available and its interpretation largely depends on the reader's perception. For the sake of NLP research, I have started two open projects:
- SelfTagger: a morphological tagger based on multi-head attention for the Russian language
- Inlandes: a Java library with support for declarative queries for quick text preprocessing and transformation
Please do not forget to take a look at my GitHub profile @marigostra!