<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Wincent Balin</title>
    <description>The latest articles on Forem by Wincent Balin (@wincentbalin).</description>
    <link>https://forem.com/wincentbalin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F173747%2F280428f8-652f-4439-9ab4-cfb40162b734.png</url>
      <title>Forem: Wincent Balin</title>
      <link>https://forem.com/wincentbalin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/wincentbalin"/>
    <language>en</language>
    <item>
      <title>Closure</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Sat, 05 Aug 2023 06:08:34 +0000</pubDate>
      <link>https://forem.com/wincentbalin/closure-69m</link>
      <guid>https://forem.com/wincentbalin/closure-69m</guid>
      <description>&lt;p&gt;After a pause, this series comes to a conclusion, mostly because of the rapid developments in the area of large language models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Original intention
&lt;/h2&gt;

&lt;p&gt;At the beginning, I intended to create a language model that, given the prompt "Geschirrabwaschgesetz" (a law about washing dishes), would write a corresponding law text in German.&lt;/p&gt;

&lt;p&gt;I was discouraged from training the original &lt;a href="http://karpathy.github.io/2015/05/21/rnn-effectiveness/"&gt;char RNN&lt;/a&gt; by the scary amount of training time that 110 M of training data would require. Therefore I went with fine-tuning a &lt;a href="https://huggingface.co/dbmdz/german-gpt2"&gt;German GPT-2&lt;/a&gt; (and later &lt;a href="https://huggingface.co/benjamin/gerpt2"&gt;the better one&lt;/a&gt;; thanks Jo!). The fine-tuning process of such a model is described &lt;a href="https://www.philschmid.de/fine-tune-a-non-english-gpt-2-model-with-huggingface"&gt;here&lt;/a&gt; or &lt;a href="https://blog.oliverflasch.de/german-gpt2/"&gt;here&lt;/a&gt;, for example.&lt;/p&gt;

&lt;h2&gt;
  
  
  (Un-)expected discovery
&lt;/h2&gt;

&lt;p&gt;I happened to discover that my intended use case is covered almost perfectly (apart from a few grammatical errors) by the &lt;a href="https://huggingface.co/jphme/Llama-2-13b-chat-german"&gt;LLAMA 2 Chat German&lt;/a&gt; model. This is very likely because it was fine-tuned with the &lt;a href="https://huggingface.co/datasets/Christoph911/German-legal-SQuAD"&gt;German legal SQuAD&lt;/a&gt; dataset, among others.&lt;/p&gt;

&lt;p&gt;I do not want to withhold the result from you (produced in &lt;a href="https://lmstudio.ai"&gt;LM Studio&lt;/a&gt;): &lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5bif6X9r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h36f7kb1t0h0j03a4h1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5bif6X9r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h36f7kb1t0h0j03a4h1r.png" alt='Output to "Geschirrabwaschgesetz"' width="800" height="903"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Just look at this beauty! It even defined "Hygiene" in the last subparagraph! And hence this series is concluded.&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Build law text corpus</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Thu, 03 Aug 2023 07:11:22 +0000</pubDate>
      <link>https://forem.com/wincentbalin/build-law-corpus-kpn</link>
      <guid>https://forem.com/wincentbalin/build-law-corpus-kpn</guid>
      <description>&lt;p&gt;In this part of series, I will describe, how to create a corpus of German law texts from &lt;a href="https://www.gesetze-im-internet.de"&gt;https://www.gesetze-im-internet.de&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Previously in series
&lt;/h2&gt;

&lt;p&gt;In the previous parts of this series, we downloaded 6518 German laws, in XML format, stored in ZIP files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversion to plain text
&lt;/h2&gt;

&lt;p&gt;Converting XML documents to plain text can be accomplished with many tools and technologies, but after thorough consideration of a couple of edge cases I decided to use an XSLT stylesheet.&lt;/p&gt;

&lt;p&gt;After studying the &lt;a href="http://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd"&gt;DTD file&lt;/a&gt;, which is referenced in the XML files, as well as the XML files themselves, the following points had to be addressed (the paths given use XPath notation):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The XML files have the root element &lt;code&gt;/dokumente&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The laws are either incredibly short and consist of a single paragraph, or rather long with a table of contents&lt;/li&gt;
&lt;li&gt;For the short laws from 2., the law name is in &lt;code&gt;metadaten/enbez&lt;/code&gt; and &lt;code&gt;metadaten/titel&lt;/code&gt; (if the first path is present) or in &lt;code&gt;metadaten/enbez&lt;/code&gt; only; for the long laws from 2., the title is in &lt;code&gt;norm/metadaten/langue&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The text body is always in &lt;code&gt;textdaten&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The paragraphs are in &lt;code&gt;P&lt;/code&gt; tags and end with a new line&lt;/li&gt;
&lt;li&gt;The definition lists are in &lt;code&gt;DL&lt;/code&gt; tags and are rendered similarly to paragraphs, but without a new line after the last entry&lt;/li&gt;
&lt;li&gt;A line break in the text is marked with a &lt;code&gt;BR&lt;/code&gt; tag, but it is not rendered within a table or a list entry&lt;/li&gt;
&lt;li&gt;Tables of contents (&lt;code&gt;TOC&lt;/code&gt; tags) are excluded, as they only repeat paragraph titles and are thus useless for language model training; they are also unusable in plain text, as there are no known page numbers&lt;/li&gt;
&lt;li&gt;Titles (&lt;code&gt;Title&lt;/code&gt; tags) are rendered with an appended new line&lt;/li&gt;
&lt;li&gt;Tables (&lt;code&gt;table&lt;/code&gt; tags) are rendered with each row (&lt;code&gt;row&lt;/code&gt; tag) ending with a new line and each cell (&lt;code&gt;entry&lt;/code&gt; tag) except the last one in a row followed by a tab character&lt;/li&gt;
&lt;li&gt;The end marker of the law text will be 25 empty lines&lt;/li&gt;
&lt;/ol&gt;
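
&lt;p&gt;The conversion itself is done by the XSLT stylesheet, but rules 10 and 11 can also be illustrated with a few lines of Python. This is only a sketch: the names &lt;code&gt;render_table&lt;/code&gt; and &lt;code&gt;END_MARKER&lt;/code&gt; are made up for this example, and the tiny table is built by hand instead of being read from a law file.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Rule 11: 25 empty lines mark the end of a law text
END_MARKER = "\n" * 25


def render_table(table):
    """Rule 10: cells separated by tabs, every row ending with a new line."""
    lines = []
    for row in table.iter("row"):
        # itertext() also collects text from elements nested inside a cell
        cells = ["".join(entry.itertext()).strip() for entry in row.iter("entry")]
        lines.append("\t".join(cells) + "\n")
    return "".join(lines)


# Hand-built example table with a single row and two cells
table = ET.Element("table")
row = ET.SubElement(table, "row")
ET.SubElement(row, "entry").text = "Nr."
ET.SubElement(row, "entry").text = "Bezeichnung"
print(repr(render_table(table)))  # prints 'Nr.\tBezeichnung\n'
```

&lt;p&gt;Joining the cells with a tab character puts a tab after every cell except the last one, which is exactly what rule 10 asks for.&lt;/p&gt;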

&lt;p&gt;And hence the short XSLT stylesheet of about 100 lines:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;On Windows, run it with &lt;a href="http://web.archive.org/web/20140812045521/http://www.microsoft.com/en-us/download/details.aspx?id=21714"&gt;msxsl.exe&lt;/a&gt; as the XSLT processor like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;msxsl BJNR001270871.xml giitotext.xsl &amp;gt; BJNR001270871.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Concatenating the text files creates a law text corpus.&lt;/p&gt;
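
&lt;p&gt;One possible way to do the concatenation is sketched below. The function name &lt;code&gt;build_corpus&lt;/code&gt; is made up for this example, and the sketch assumes that the text files are encoded in UTF-8 and that the corpus file is written outside of the directory with the text files.&lt;/p&gt;

```python
from pathlib import Path


def build_corpus(text_dir, corpus_file):
    """Concatenate all *.txt files in text_dir into corpus_file; return the file count."""
    # Sorting keeps the order of the laws reproducible between runs
    text_files = sorted(Path(text_dir).glob("*.txt"))
    with open(corpus_file, "w", encoding="utf-8") as corpus:
        for text_file in text_files:
            # Assumption: the converted law texts are UTF-8 encoded
            corpus.write(text_file.read_text(encoding="utf-8"))
    return len(text_files)
```

&lt;p&gt;A call such as &lt;code&gt;build_corpus("txt", "corpus.dat")&lt;/code&gt; (both names are placeholders) would then write the corpus and report how many law texts went into it.&lt;/p&gt;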

&lt;h2&gt;
  
  
  Next step
&lt;/h2&gt;

&lt;p&gt;In the next part of the series we will see how to train a language model with the text corpus we just created.&lt;/p&gt;

</description>
      <category>python</category>
      <category>xslt</category>
    </item>
    <item>
      <title>Fetch German laws</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Sat, 26 Jun 2021 22:44:44 +0000</pubDate>
      <link>https://forem.com/wincentbalin/fetch-german-laws-4g4</link>
      <guid>https://forem.com/wincentbalin/fetch-german-laws-4g4</guid>
      <description>&lt;p&gt;In this part of series, I will describe, how to fetch German law texts from &lt;a href="https://www.gesetze-im-internet.de"&gt;https://www.gesetze-im-internet.de&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Four formats
&lt;/h2&gt;

&lt;p&gt;The (federal) laws in Germany are published by the &lt;a href="https://www.bmjv.de/"&gt;Federal Ministry of Justice and Consumer Protection&lt;/a&gt; on &lt;a href="https://www.gesetze-im-internet.de"&gt;https://www.gesetze-im-internet.de&lt;/a&gt;. There are also land (i.e. state) laws, published &lt;a href="http://www.justiz.de/onlinedienste/bundesundlandesrecht/index.php"&gt;here&lt;/a&gt;, administrative regulations, published &lt;a href="http://www.verwaltungsvorschriften-im-internet.de/"&gt;here&lt;/a&gt;, and many more laws, but for the sake of simplicity we will use the texts of federal laws only.&lt;/p&gt;

&lt;p&gt;As stated in the &lt;a href="http://www.gesetze-im-internet.de/hinweise.html"&gt;notes page&lt;/a&gt;, there are four formats available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTML (which you can view in browser)&lt;/li&gt;
&lt;li&gt;PDF (most suitable for archive or for printed documents)&lt;/li&gt;
&lt;li&gt;EPUB (for e-book readers)&lt;/li&gt;
&lt;li&gt;XML (original format, which can be converted easily to other formats)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The format of the XML representation is defined by &lt;a href="http://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd"&gt;this DTD&lt;/a&gt;, which will become very helpful in the next part of this series.&lt;/p&gt;

&lt;p&gt;As also stated on the notes page mentioned above, the index of XML documents is available at &lt;a href="http://www.gesetze-im-internet.de/gii-toc.xml"&gt;http://www.gesetze-im-internet.de/gii-toc.xml&lt;/a&gt;. This index links to XML documents packed into ZIP archives, all of them having the same name &lt;code&gt;xml.zip&lt;/code&gt;.&lt;/p&gt;
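
&lt;p&gt;If you would like to look into this index, a sketch with &lt;code&gt;xml.etree.ElementTree&lt;/code&gt; from the standard library could look like the following. It assumes that every entry of the index is an &lt;code&gt;item&lt;/code&gt; element with a &lt;code&gt;title&lt;/code&gt; and a &lt;code&gt;link&lt;/code&gt; child, so please check this against the actual file; the function names are made up for this example.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def read_index(root):
    """Return (title, link) pairs for every item element below root."""
    entries = []
    # Assumption: each law is listed as an item with title and link children
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if link:
            entries.append((title, link))
    return entries


def load_index(path):
    """Parse a locally saved copy of gii-toc.xml."""
    return read_index(ET.parse(path).getroot())
```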

&lt;h2&gt;
  
  
  The choice of the format
&lt;/h2&gt;

&lt;p&gt;From the four available formats, we need the one that represents the resulting text with the least markup. The requirement comes from the need to generate a future law text with as little markup as possible.&lt;/p&gt;

&lt;p&gt;This requirement, of course, eliminates the PDF format, because it is geared towards printed media. While the HTML format could be converted to text, for example with the venerable &lt;a href="http://www.aaronsw.com/2002/html2text/"&gt;html2text&lt;/a&gt;, the contents of the law texts are split across many small sections, which complicates the conversion. The conversion of the EPUB format to text is difficult to customise, at least in comparison to XML. Finally, for the XML format, there is already a converter to plain text, described in &lt;a href="https://ofdigitalwater.postach.io/post/convert-german-laws-from-xml-to-text"&gt;another post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So we need the documents in XML format.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to parse HTML with batteries included
&lt;/h2&gt;

&lt;p&gt;Even before &lt;a href="https://www.crummy.com/software/BeautifulSoup/"&gt;Beautiful Soup&lt;/a&gt;, it was possible to parse HTML data using the class &lt;code&gt;HTMLParser&lt;/code&gt; from the module &lt;code&gt;html.parser&lt;/code&gt;, documented &lt;a href="https://docs.python.org/3/library/html.parser.html#html.parser.HTMLParser"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Also, even before &lt;a href="https://docs.python-requests.org/"&gt;requests&lt;/a&gt;, it was possible to fetch data over HTTP with the functions &lt;code&gt;urlopen&lt;/code&gt; and &lt;code&gt;urlretrieve&lt;/code&gt; from the module &lt;code&gt;urllib.request&lt;/code&gt;, documented &lt;a href="https://docs.python.org/3/library/urllib.request.html#urllib.request.urlopen"&gt;here&lt;/a&gt; and &lt;a href="https://docs.python.org/3/library/urllib.request.html#urllib.request.urlretrieve"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Should you ask yourself at this point why I overlook two very nice, tried and tested Python packages, please read the list under &lt;strong&gt;First things first&lt;/strong&gt; in &lt;a href="https://ofdigitalwater.postach.io/post/generate-german-laws"&gt;this article&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To parse HTML with the &lt;code&gt;HTMLParser&lt;/code&gt; class, you simply create a subclass from it. Then, depending on what you need to get from HTML data, you implement the &lt;code&gt;handle_*&lt;/code&gt; methods. For example, to parse links from the &lt;a href="https://www.gesetze-im-internet.de"&gt;https://www.gesetze-im-internet.de&lt;/a&gt; front page, you need the following code:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
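
&lt;p&gt;A minimal sketch of such a subclass, which is not necessarily identical to the code in the gist, could look like this. The names &lt;code&gt;LinkParser&lt;/code&gt; and &lt;code&gt;fetch_links&lt;/code&gt; are made up for this example; only &lt;code&gt;HTMLParser&lt;/code&gt;, &lt;code&gt;urlopen&lt;/code&gt; and &lt;code&gt;urljoin&lt;/code&gt; come from the standard library.&lt;/p&gt;

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the targets of all links on a page as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive in lower case; attrs is a list of pairs
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def fetch_links(url):
    """Download a page and return the links found on it."""
    with urlopen(url) as response:
        # Fall back to UTF-8 if the server does not announce a charset
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    parser = LinkParser(url)
    parser.feed(html)
    return parser.links
```

&lt;p&gt;Calling &lt;code&gt;fetch_links("https://www.gesetze-im-internet.de")&lt;/code&gt; would then return the links of the front page. Resolving relative links with &lt;code&gt;urljoin&lt;/code&gt; is a design choice of this sketch, so that the results can be passed on to &lt;code&gt;urlopen&lt;/code&gt; directly.&lt;/p&gt;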


&lt;h2&gt;
  
  
  Collecting all XML documents
&lt;/h2&gt;

&lt;p&gt;While, as mentioned above, there is a list of XML documents &lt;a href="http://www.gesetze-im-internet.de/gii-toc.xml"&gt;here&lt;/a&gt;, we will try to collect URLs of all XML documents from the list of current documents at &lt;a href="http://www.gesetze-im-internet.de/aktuell.html"&gt;http://www.gesetze-im-internet.de/aktuell.html&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The parser implemented for this page is similar to the previous example. As the current documents are grouped by the first character into separate lists, this parser collects the links to these lists:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;As all links to document lists are stored in the variable &lt;code&gt;partial_list_urls&lt;/code&gt;, we must add another parser to fetch the links to XML documents. This parser also stores law names.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;h2&gt;
  
  
  Complete fetch code
&lt;/h2&gt;

&lt;p&gt;If we combine the two examples, and add some error handling and some &lt;code&gt;urlretrieve&lt;/code&gt; action as well, we get this:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;After executing this code, we have 6518 ZIP files in the cache directory.&lt;/p&gt;
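
&lt;p&gt;Because all archives are named &lt;code&gt;xml.zip&lt;/code&gt;, the download step needs unique local file names. The following sketch shows one way to do that with &lt;code&gt;urlretrieve&lt;/code&gt;; it is not the code from the gist. The function names and the directory name &lt;code&gt;cache&lt;/code&gt; are made up, and the sketch assumes that the path segment right before &lt;code&gt;xml.zip&lt;/code&gt; identifies the law.&lt;/p&gt;

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def cache_path(zip_url, cache_dir="cache"):
    """Map a URL ending in .../bgb/xml.zip to cache/bgb.zip."""
    parts = [part for part in urlparse(zip_url).path.split("/") if part]
    # Assumption: the directory right before xml.zip identifies the law
    law_id = parts[-2]
    return Path(cache_dir) / (law_id + ".zip")


def fetch_all(zip_urls, cache_dir="cache"):
    """Download every archive that is not cached yet; return the failed URLs."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for url in zip_urls:
        target = cache_path(url, cache_dir)
        if target.exists():
            continue  # already downloaded in an earlier run
        try:
            urlretrieve(url, str(target))
        except OSError:
            # URLError and HTTPError are subclasses of OSError
            failed.append(url)
    return failed
```

&lt;p&gt;Skipping files that already exist makes it possible to simply run the download again after a network error.&lt;/p&gt;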

&lt;h2&gt;
  
  
  Next step
&lt;/h2&gt;

&lt;p&gt;In the next step, we will build the text corpus from all the law texts fetched.&lt;/p&gt;

&lt;p&gt;Stay tuned!&lt;/p&gt;

</description>
      <category>python</category>
      <category>scraping</category>
      <category>webscraping</category>
    </item>
    <item>
      <title>Generate German laws</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Sat, 26 Jun 2021 20:08:52 +0000</pubDate>
      <link>https://forem.com/wincentbalin/generate-german-laws-k9c</link>
      <guid>https://forem.com/wincentbalin/generate-german-laws-k9c</guid>
      <description>&lt;p&gt;In this series, I am going to describe, how to build a generator of German laws.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First things first:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I am doing this for my own amusement.&lt;/li&gt;
&lt;li&gt;Because of &lt;strong&gt;1.&lt;/strong&gt;, I will not necessarily seek simple ways to do things.&lt;/li&gt;
&lt;li&gt;You will most probably facepalm repeatedly reading the articles from this series.&lt;/li&gt;
&lt;li&gt;Given enough time, you will learn to enjoy &lt;strong&gt;3.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The goal
&lt;/h2&gt;

&lt;p&gt;… consists of four parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch German laws from &lt;a href="https://www.gesetze-im-internet.de"&gt;https://www.gesetze-im-internet.de&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Build text corpus from the downloaded laws&lt;/li&gt;
&lt;li&gt;Train a char-RNN with the text corpus&lt;/li&gt;
&lt;li&gt;Create an easy-to-use generator of German laws&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The prototype for XML conversion was done in the &lt;a href="https://ofdigitalwater.postach.io/post/convert-german-laws-from-xml-to-text"&gt;previous article&lt;/a&gt; on this blog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next step
&lt;/h2&gt;

&lt;p&gt;In the next post, we are going to create the code that fetches all the law texts.&lt;/p&gt;

&lt;p&gt;Stay tuned!&lt;/p&gt;

</description>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Convert German laws from XML to text using XSLT</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Sat, 26 Jun 2021 01:33:14 +0000</pubDate>
      <link>https://forem.com/wincentbalin/convert-german-laws-from-xml-to-text-3b2c</link>
      <guid>https://forem.com/wincentbalin/convert-german-laws-from-xml-to-text-3b2c</guid>
      <description>&lt;p&gt;For a small project, I needed to convert German laws, found at &lt;a href="https://www.gesetze-im-internet.de/"&gt;https://www.gesetze-im-internet.de/&lt;/a&gt;, from XML format to text format.&lt;/p&gt;

&lt;p&gt;The XML format is described &lt;a href="https://www.gesetze-im-internet.de/hinweise.html"&gt;here&lt;/a&gt; and is defined by &lt;a href="https://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd"&gt;this DTD file&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The source code in the following XSL file is pretty straightforward. Only adding newlines and indenting definition lists posed an additional challenge.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


</description>
      <category>xslt</category>
    </item>
    <item>
      <title>How to import large Plaso file into Timesketch in Docker</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Thu, 12 Mar 2020 20:20:22 +0000</pubDate>
      <link>https://forem.com/wincentbalin/how-to-import-large-plaso-file-into-timesketch-in-docker-5afc</link>
      <guid>https://forem.com/wincentbalin/how-to-import-large-plaso-file-into-timesketch-in-docker-5afc</guid>
      <description>&lt;p&gt;Sometimes &lt;a href="https://github.com/google/timesketch/"&gt;Timesketch&lt;/a&gt;, being run in &lt;a href="https://www.docker.com/"&gt;Docker&lt;/a&gt;, hiccups when importing a &lt;a href="https://github.com/log2timeline/plaso"&gt;Plaso&lt;/a&gt; file too large, like in the &lt;a href="https://github.com/google/timesketch/issues/1060"&gt;issue #1060&lt;/a&gt;. You can still upload the file using this shell script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/sh&lt;/span&gt;
&lt;span class="c"&gt;#&lt;/span&gt;
&lt;span class="c"&gt;# Run this script with timesketch_import_plaso.sh plaso_file [timesketch_container]&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$# &lt;/span&gt;&lt;span class="nt"&gt;-eq&lt;/span&gt; 0 ]
&lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;Run this script with &lt;span class="nv"&gt;$0&lt;/span&gt; plaso_file &lt;span class="o"&gt;[&lt;/span&gt;timesketch_container]
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nv"&gt;DOCKER_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/tmp/&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;TIMELINE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$1&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s1"&gt;'s/\.[^.]*$//'&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;CONTAINER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docker_timesketch_1
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nv"&gt;CONTAINER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;docker &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CONTAINER&lt;/span&gt;&lt;span class="s2"&gt;:/tmp"&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CONTAINER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; psort.py &lt;span class="nt"&gt;-o&lt;/span&gt; timesketch &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TIMELINE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DOCKER_PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CONTAINER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DOCKER_PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>docker</category>
      <category>plaso</category>
      <category>timesketch</category>
      <category>forensics</category>
    </item>
    <item>
      <title>Download links for Microsoft Windows Services for UNIX 3.5 and SUA</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Fri, 04 Oct 2019 21:54:16 +0000</pubDate>
      <link>https://forem.com/wincentbalin/download-link-for-microsoft-windows-services-for-unix-3-5-555p</link>
      <guid>https://forem.com/wincentbalin/download-link-for-microsoft-windows-services-for-unix-3-5-555p</guid>
      <description>&lt;p&gt;Should you want to use &lt;a href="https://en.wikipedia.org/wiki/Windows_Services_for_UNIX"&gt;Microsoft Windows Services for UNIX&lt;/a&gt; (SFU) within Windows XP or Windows Server 2003, you need SFU 3.5, which you will currently (October 2019) find either &lt;a href="https://archive.org/download/cdrom-services-unix-3.5-microsoft-2004"&gt;in the Internet Archive&lt;/a&gt; as an ISO image or &lt;a href="https://www.microsoft.com/en-us/download/details.aspx?id=20983"&gt;at Microsoft&lt;/a&gt; as setup executables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Addendum&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Windows_Services_for_UNIX#Legacy"&gt;Subsystem for UNIX-Based Applications&lt;/a&gt; (SUA), which works with Windows 7, is also available &lt;a href="https://www.microsoft.com/en-us/download/details.aspx?id=2391"&gt;at Microsoft&lt;/a&gt;, as well as &lt;a href="https://www.microsoft.com/en-us/download/details.aspx?id=23754"&gt;SUA for Windows Vista&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>windows</category>
      <category>compiler</category>
    </item>
    <item>
      <title>If you want to run Docker on local Linux box</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Fri, 04 Oct 2019 11:47:16 +0000</pubDate>
      <link>https://forem.com/wincentbalin/if-you-want-to-run-docker-on-local-linux-box-1fj3</link>
      <guid>https://forem.com/wincentbalin/if-you-want-to-run-docker-on-local-linux-box-1fj3</guid>
      <description>&lt;p&gt;If you would like to run &lt;a href="https://www.docker.com/"&gt;Docker&lt;/a&gt; on a Linux box in your LAN, and you already configured the Linux box hostname as &lt;em&gt;computer1&lt;/em&gt; and a user account there as &lt;em&gt;me&lt;/em&gt;, and your current Docker environment is &lt;a href="https://docs.docker.com/toolbox/toolbox_install_windows/"&gt;Docker Toolbox on Windows&lt;/a&gt; together with &lt;a href="https://docs.docker.com/machine/"&gt;docker-machine&lt;/a&gt;, perform the following steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add user &lt;em&gt;me&lt;/em&gt; to the group &lt;em&gt;sudo&lt;/em&gt; on your future Docker host: &lt;code&gt;usermod -a -G sudo me&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Remove password prompt when running &lt;code&gt;sudo&lt;/code&gt; (as described &lt;a href="https://kofler.info/sudo-ohne-passwort/"&gt;here&lt;/a&gt;):
Replace &lt;code&gt;%sudo ALL=(ALL) ALL&lt;/code&gt; with &lt;code&gt;%sudo ALL=(ALL) NOPASSWD: ALL&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Run this command in your current Docker environment to install Docker on your future Docker host: &lt;code&gt;docker-machine create --driver generic --generic-ip-address computer1 --generic-engine-port 2375 --generic-ssh-user me computer1&lt;/code&gt;. The last part is the name of the configuration in your current Docker environment.&lt;/li&gt;
&lt;li&gt;Activate the configuration in your current Docker environment: &lt;code&gt;eval $("C:\Program Files\Docker Toolbox\docker-machine.exe" env computer1)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Reverse step 2 and, if needed, also step 1&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then you are ready to use Docker on your &lt;em&gt;computer1&lt;/em&gt; box.&lt;/p&gt;

&lt;p&gt;Perform step 4 to activate this configuration again.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>linux</category>
    </item>
    <item>
      <title>Starting with Intel Galileo</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Tue, 01 Oct 2019 15:59:59 +0000</pubDate>
      <link>https://forem.com/wincentbalin/starting-with-intel-galileo-5hk</link>
      <guid>https://forem.com/wincentbalin/starting-with-intel-galileo-5hk</guid>
      <description>&lt;p&gt;&lt;a href="https://www.intel.com/content/www/us/en/support/products/78906/boards-and-kits/intel-galileo-boards.html"&gt;Intel Galileo&lt;/a&gt; is a Intel Pentium-based platform, which is supported by Arduino IDE. It runs Linux, and it is possible to install a PCI-Express card (most often the Intel Centrino Wi-Fi board gets installed).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--A_Pt7fCW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images.postach.io/fa097c2a-1c29-4702-93f4-ab4cf778aa39/e8b0153c-6cfd-4b5c-a5ea-ed05765823a6/671d3e4a-58c0-4bd7-9f5d-0bfb34533eb1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A_Pt7fCW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images.postach.io/fa097c2a-1c29-4702-93f4-ab4cf778aa39/e8b0153c-6cfd-4b5c-a5ea-ed05765823a6/671d3e4a-58c0-4bd7-9f5d-0bfb34533eb1.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is also one of the &lt;a href="https://www.arduino.cc/en/ArduinoCertified/IntelGalileo"&gt;Arduino-certified&lt;/a&gt; platforms. The configuration in the Arduino IDE consists of installing the &lt;strong&gt;Intel i586 boards&lt;/strong&gt; platform in the board manager. The firmware is uploaded through the &lt;em&gt;USB CLIENT&lt;/em&gt; port as usual.&lt;/p&gt;

&lt;p&gt;If you want to communicate with Linux on Galileo directly, you need the &lt;a href="https://www.intel.com/content/www/us/en/support/articles/000006343/boards-and-kits/intel-galileo-boards.html"&gt;cable adapter&lt;/a&gt;; if you would like to solder one by yourself, the pinout is available &lt;a href="https://www.pccables.com/Products/Intel-Galileo-Board-Serial-Cable-DB9-F-to-3.5mm.html"&gt;here&lt;/a&gt;. Connect to the board using serial port with 115200 bps, and voilà!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--L9KHX78_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images.postach.io/fa097c2a-1c29-4702-93f4-ab4cf778aa39/e8b0153c-6cfd-4b5c-a5ea-ed05765823a6/245d5662-065b-4f22-875c-8a9e169240ac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--L9KHX78_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images.postach.io/fa097c2a-1c29-4702-93f4-ab4cf778aa39/e8b0153c-6cfd-4b5c-a5ea-ed05765823a6/245d5662-065b-4f22-875c-8a9e169240ac.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>hardware</category>
      <category>linux</category>
      <category>arduino</category>
    </item>
    <item>
      <title>Office in Vagrant VM</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Tue, 16 Jul 2019 16:48:25 +0000</pubDate>
      <link>https://forem.com/wincentbalin/office-in-vagrant-vm-30ed</link>
      <guid>https://forem.com/wincentbalin/office-in-vagrant-vm-30ed</guid>
      <description>&lt;p&gt;I wanted to use office software in a VM, while being able to edit files on the host machine. Usually, people create a VM in &lt;a href="https://www.virtualbox.org/"&gt;VirtualBox&lt;/a&gt; and map the host directory into this VM using &lt;a href="https://www.virtualbox.org/manual/ch04.html#sharedfolders"&gt;shared folders&lt;/a&gt;. But, because it is a long process, I decided to automate it using &lt;a href="https://vagrantup.com"&gt;Vagrant&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  TL;DR
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://github.com/wincentbalin/VirtualOffice"&gt;this GitHub repository&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Download &lt;code&gt;Vagrantfile&lt;/code&gt; and place it into your documents directory&lt;/li&gt;
&lt;li&gt;Open your favourite CLI and change into that directory&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;vagrant up&lt;/code&gt; to configure the VM and wait for a while&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;vagrant rdp&lt;/code&gt; and use &lt;em&gt;vagrant&lt;/em&gt; as login and as password&lt;/li&gt;
&lt;li&gt;Open LibreOffice and configure your personal data, so the documents with data fields can set them to appropriate values&lt;/li&gt;
&lt;li&gt;Edit your documents on the host system!&lt;/li&gt;
&lt;/ol&gt;

&lt;h1&gt;
  
  
  Design decisions
&lt;/h1&gt;

&lt;p&gt;The &lt;code&gt;Vagrantfile&lt;/code&gt; must be placed into the directory with the documents you want to edit. Of course, I could have used the default documents directory of the host OS, but I decided against it: first, it would create an additional maintenance burden (especially if the syntax for default OS paths changes), and second, it would not work on systems where the documents directory was moved to another location. So, for now, the entry for the synchronised directory is&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;synced_folder&lt;/span&gt; &lt;span class="s2"&gt;"."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"/home/vagrant/Documents"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;mount_options: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dmode=775,fmode=664"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The VM created by Vagrant is based on Ubuntu Bionic x64. The &lt;code&gt;Vagrantfile&lt;/code&gt; installs the packages &lt;code&gt;xubuntu-desktop&lt;/code&gt; and &lt;code&gt;libreoffice&lt;/code&gt;. Then it also enables Remote Desktop connexions by installing &lt;code&gt;xrdp&lt;/code&gt;, by starting it automatically as a service and by enabling port 3389. The forwarding to the port is configured with&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;network&lt;/span&gt; &lt;span class="s2"&gt;"forwarded_port"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;guest: &lt;/span&gt;&lt;span class="mi"&gt;3389&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;host: &lt;/span&gt;&lt;span class="mi"&gt;33389&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;protocol: &lt;/span&gt;&lt;span class="s2"&gt;"tcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;auto_correct: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
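
&lt;p&gt;To check from the host that the forwarded port actually accepts connections, a small Python sketch like the following may help. The function name &lt;code&gt;port_is_open&lt;/code&gt; is made up for this illustration, and 33389 is only the requested host port: because of &lt;code&gt;auto_correct: true&lt;/code&gt;, Vagrant may pick a different one if 33389 is already taken.&lt;/p&gt;

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (for example a refused
        # connection or a timeout) if nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

&lt;p&gt;With the VM running, &lt;code&gt;port_is_open("127.0.0.1", 33389)&lt;/code&gt; should return &lt;code&gt;True&lt;/code&gt; once &lt;code&gt;xrdp&lt;/code&gt; is listening, assuming the host port was not auto-corrected.&lt;/p&gt;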



&lt;p&gt;The automatic start of the XFCE session is enabled by adding &lt;code&gt;xfce4-session&lt;/code&gt; to the file &lt;code&gt;.xsession&lt;/code&gt;. Everything runs as the default Vagrant user &lt;code&gt;vagrant&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>vagrant</category>
    </item>
    <item>
      <title>Run IDLE (Python IDE) in virtual environment</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Fri, 28 Jun 2019 11:41:08 +0000</pubDate>
      <link>https://forem.com/wincentbalin/run-idle-python-ide-in-virtual-environment-35c3</link>
      <guid>https://forem.com/wincentbalin/run-idle-python-ide-in-virtual-environment-35c3</guid>
      <description>&lt;p&gt;Imagine: you are running software implemented in Python and there is a problem you would like to debug or fix. The software resides in a virtual environment, and apart from this virtual environment and a standard Python installation nothing else is installed (or permitted to be installed). What should you do?&lt;/p&gt;

&lt;p&gt;You can run &lt;a href="https://docs.python.org/3/library/idle.html"&gt;IDLE&lt;/a&gt; within the activated virtual environment with this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; idlelib.idle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command opens the IDLE shell window with a Python prompt. From there you can open the file you would like to edit.&lt;/p&gt;

&lt;p&gt;But what if you would like to open a Python file right away? Then use this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; idlelib.idle filename
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
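
&lt;p&gt;If you would rather start IDLE from a Python script than from the shell, the two commands above can be sketched as follows. The helper names &lt;code&gt;idle_command&lt;/code&gt; and &lt;code&gt;launch_idle&lt;/code&gt; are hypothetical. The sketch relies on &lt;code&gt;sys.executable&lt;/code&gt;, which normally points at the interpreter of the virtual environment when the script is run inside the activated environment.&lt;/p&gt;

```python
import subprocess
import sys


def idle_command(filename=None):
    """Build the argument list for starting IDLE with the current interpreter."""
    command = [sys.executable, "-m", "idlelib.idle"]
    if filename is not None:
        # A file name given as an argument is opened for editing.
        command.append(filename)
    return command


def launch_idle(filename=None):
    """Start IDLE as a separate process and return the Popen object."""
    return subprocess.Popen(idle_command(filename))
```

&lt;p&gt;Calling &lt;code&gt;launch_idle()&lt;/code&gt; corresponds to the first command, and &lt;code&gt;launch_idle("filename")&lt;/code&gt; to the second.&lt;/p&gt;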



&lt;p&gt;While IDLE does not have all the niceties of PyCharm, it is better than Notepad.exe, is almost always installed and has debugging capabilities. You might even enjoy it.&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="https://gist.github.com/wincentbalin/411d89dea8a0a017eb4068cf5a007b2e"&gt;Run IDLE from a batch file&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
    </item>
    <item>
      <title>Python shebang</title>
      <dc:creator>Wincent Balin</dc:creator>
      <pubDate>Sat, 15 Jun 2019 11:08:02 +0000</pubDate>
      <link>https://forem.com/wincentbalin/python-shebang-26jc</link>
      <guid>https://forem.com/wincentbalin/python-shebang-26jc</guid>
      <description>&lt;p&gt;Currently, in the interregnum where Python 2 and Python 3 may co-exist on the same system, the &lt;a href="https://www.python.org/dev/peps/pep-0394/#recommendation"&gt;PEP 394 recommendations&lt;/a&gt; for the &lt;a href="https://en.wikipedia.org/wiki/Shebang_(Unix)"&gt;shebang&lt;/a&gt; line in Python programs are, in short:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use &lt;code&gt;#!/usr/bin/env&lt;/code&gt; before the name of the Python interpreter&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;#!/usr/bin/env python&lt;/code&gt; &lt;em&gt;only&lt;/em&gt; for programs that work with &lt;em&gt;both&lt;/em&gt; Python 2 and Python 3&lt;/li&gt;
&lt;li&gt;If your program runs with Python 3 only, replace &lt;code&gt;python&lt;/code&gt; in the previous line with &lt;code&gt;python3&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Otherwise, for programs that run with Python 2 only, replace &lt;code&gt;python&lt;/code&gt; with &lt;code&gt;python2&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
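
&lt;p&gt;The rules above can be sketched as a small Python function. The name &lt;code&gt;recommended_shebang&lt;/code&gt; and its two boolean parameters are made up for this illustration. The function only mirrors the list above and is not part of the PEP itself.&lt;/p&gt;

```python
def recommended_shebang(supports_python2, supports_python3):
    """Return the shebang line matching the rules listed above."""
    if supports_python2 and supports_python3:
        # Works with both major versions: plain "python".
        interpreter = "python"
    elif supports_python3:
        interpreter = "python3"
    elif supports_python2:
        interpreter = "python2"
    else:
        raise ValueError("the program must support at least one Python version")
    return "#!/usr/bin/env " + interpreter
```

&lt;p&gt;For example, &lt;code&gt;recommended_shebang(False, True)&lt;/code&gt; returns &lt;code&gt;#!/usr/bin/env python3&lt;/code&gt;.&lt;/p&gt;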

</description>
      <category>python</category>
    </item>
  </channel>
</rss>
