Hello, my name is Zala. Today I will show you how to load a set of text documents into Orange and analyze their content in a word cloud. For this demonstration, I will use our file server, go to its data folder, and open the folder Proposals to Government 1k. I have posted these and other links I will use in the description box below.
The folder I opened contains about 1000 proposals to the Slovenian Government that we automatically translated into English. Let's look at the file 10014.txt. It includes the proposal to ban trucks from overtaking on highways. The file 1016.txt includes a proposal on road safety. Besides the proposals' text, the folder also includes meta information about each proposal, stored in YAML files. For example, let me download 10014.yaml. It provides information such as the author of the proposal, the number of page views of the proposal, and the text of the government response. There's also the title of the proposal, "No overtaking of goods vehicles".
I will now open Orange. I have already installed the Text add-on, but you can see how to do it in a separate tutorial. I will choose the Import Documents widget from this add-on and paste the server address. It might take a while to load all the documents. Finally, the widget has loaded 1093 documents from the server. I will use the Corpus widget to choose the feature Orange will recognize as the document's title and to denote that the text I would like to analyze is stored in the feature called content. I can now use Corpus Viewer to view the documents. Our truck overtaking ban proposal is the second one on the list. Fine.
Let me extract the most frequent words in this corpus. First, I will use a text preprocessor. The widget proposes the most common preprocessing tasks, and I will accept them. I will convert the text into lowercase, extract the words, lemmatize them, and remove the stop words and numbers.
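
If you would like to see roughly what these preprocessing steps do, here is a minimal sketch in plain Python. It is not how Orange implements them: the function name `preprocess` and the tiny stop-word list are assumptions made for illustration, and lemmatization is left out because it needs a language model.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "of", "on", "to", "a", "an", "and", "in", "is"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, drop stop words and numbers.

    Lemmatization is intentionally omitted in this sketch.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]


print(preprocess("Ban the overtaking of 2 trucks on the highways"))
# → ['ban', 'overtaking', 'trucks', 'highways']
```

The order of the steps matters: lowercasing comes first so that "The" and "the" are both caught by the stop-word filter.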
I can check the outcome of this preprocessing and tokenization in the Corpus Viewer by turning on Show Tokens and Tags.
Okay, now to the word cloud. It looks like work, vehicle, state, and pay are among the most frequently used words in the proposals. I can now select a word like vehicle, which occurs 968 times in 265 documents. Here they are, listed in the Corpus Viewer.
Let me look at my workflow again. I have imported the text corpus from the file server, used the Corpus widget to define which variables store the title and the text content, and then preprocessed the text and showed the frequently occurring words in the word cloud. In Orange, most visualizations are interactive, so I can choose words in the word cloud and check the documents that contain the selected words.
This video was actually about loading the corpus into Orange. While I have used our file server to store the documents, you can keep them in your local folder. For example, I will download the zipped proposals. Here are the files. Now I can use the Import Documents widget to load the documents from my local folder.
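
As a rough illustration of the numbers the word cloud reports (how often a word occurs and in how many documents), here is a small Python sketch. It does not use Orange. The helper names `load_folder`, `word_stats`, and `documents_with` are hypothetical, the tokenization is a simple assumption, and the three sample documents are made up for the example.

```python
import re
from collections import Counter
from pathlib import Path


def load_folder(folder):
    """Read every .txt file in a local folder into a {file name: text} dict."""
    return {p.name: p.read_text(encoding="utf-8")
            for p in sorted(Path(folder).glob("*.txt"))}


def tokenize(text):
    """Lowercase the text and keep runs of letters (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def word_stats(docs):
    """Return two counters: total occurrences per word, and documents per word."""
    term_freq, doc_freq = Counter(), Counter()
    for text in docs.values():
        tokens = tokenize(text)
        term_freq.update(tokens)
        doc_freq.update(set(tokens))  # count each word once per document
    return term_freq, doc_freq


def documents_with(docs, word):
    """List the names of the documents that contain the given word."""
    return [name for name, text in docs.items() if word in tokenize(text)]


# Made-up sample documents; load_folder() on a local folder returns the same shape.
docs = {
    "a.txt": "Vehicle tax. Every vehicle pays.",
    "b.txt": "Road safety for each vehicle.",
    "c.txt": "Pay for work.",
}
term_freq, doc_freq = word_stats(docs)
print(term_freq["vehicle"], doc_freq["vehicle"])  # → 3 2
print(documents_with(docs, "vehicle"))            # → ['a.txt', 'b.txt']
```

Selecting a word in the word cloud and viewing the matching documents corresponds to the `documents_with` step here.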