<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: vinay</title>
    <description>The latest articles on Forem by vinay (@vinaybommana7).</description>
    <link>https://forem.com/vinaybommana7</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F338981%2Fc1584fef-9967-4bb2-9901-ed41f51ee9a8.jpg</url>
      <title>Forem: vinay</title>
      <link>https://forem.com/vinaybommana7</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/vinaybommana7"/>
    <language>en</language>
    <item>
      <title>Creating our own color theme in vscode</title>
      <dc:creator>vinay</dc:creator>
      <pubDate>Mon, 24 May 2021 23:21:23 +0000</pubDate>
      <link>https://forem.com/vinaybommana7/creating-our-own-color-theme-in-vscode-2b9m</link>
      <guid>https://forem.com/vinaybommana7/creating-our-own-color-theme-in-vscode-2b9m</guid>
      <description>&lt;h1&gt;
  
  
  the dilemma
&lt;/h1&gt;

&lt;p&gt;We've all been there: the urge to please our eyes when looking at a particular block of code. You like some nuances of a colour scheme, and some things you just don't like at all. You use a scheme for a while, but there is still that voice telling you it can be better, that your experience writing code still needs improvement.&lt;/p&gt;

&lt;p&gt;This led me to edit out some of the colors I just didn't like in the themes I was using. At first I was drawn to the simplicity of &lt;a href="https://github.com/jamiewilson/predawn" rel="noopener noreferrer"&gt;Predawn&lt;/a&gt;, but the oranges didn't work for me. The minimalistic choice of colors is fine, but not quite enough. Then I found &lt;a href="https://marketplace.visualstudio.com/items?itemName=CrazyFluff.bettermaterialthemedarkerhighcontrast" rel="noopener noreferrer"&gt;material darker with high contrast&lt;/a&gt;, but its color palette is not minimalistic like Predawn's, and the italics it uses for comments are icky.&lt;br&gt;
Speaking of colour palettes, I knew a better dark palette anyway: &lt;a href="https://www.nordtheme.com" rel="noopener noreferrer"&gt;Nord&lt;/a&gt;. So one unproductive Saturday morning I forced myself to edit the colour palette in settings.json.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F46dl6gqnlqyrwj5giypl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F46dl6gqnlqyrwj5giypl.png" alt="Nord Palette"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsz7p0b3if3dro0ckjt9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsz7p0b3if3dro0ckjt9w.png" alt="Nord Palette"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;workbench.colorCustomizations:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Things like the cursor colour, the background, taking out the italics, and so on. Down the rabbit hole, I ended up reading about how to create your own color scheme from an existing one.&lt;/p&gt;

&lt;p&gt;The Material darker high contrast theme looked like a good starting point for tweaking, since its UI indicators and separation of high-contrast colors are &lt;em&gt;good enough&lt;/em&gt; for me. Enough small talk; let's get into the three easy steps of creating a color scheme&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;apply your own favourite color scheme by hitting
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cmd+shift+p &amp;gt; Preferences: Color theme &amp;gt; &amp;lt;select the theme&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="2"&gt;
&lt;li&gt;convert the existing theme to json format
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cmd+shift+p &amp;gt; Developer: Generate Color theme from Current settings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;this will create an untitled file with the color palette and settings for every customisable UI element in vscode.&lt;/p&gt;

&lt;p&gt;tweak the colors to your liking.&lt;/p&gt;

&lt;p&gt;some of the tweaks I've made are&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;colors: &lt;span class="o"&gt;{&lt;/span&gt;
// changed all blues to &lt;span class="c"&gt;#5E81AC like&lt;/span&gt;
&lt;span class="s2"&gt;"activityBarBadge.background"&lt;/span&gt;: &lt;span class="s2"&gt;"#5E81AC"&lt;/span&gt;,
// main editor background
&lt;span class="s2"&gt;"editor.background"&lt;/span&gt;: &lt;span class="s2"&gt;"#212121"&lt;/span&gt;,
// current line number to be more focused
&lt;span class="s2"&gt;"editorLineNumber.activeForeground"&lt;/span&gt;: &lt;span class="s2"&gt;"#eeffff"&lt;/span&gt;,
// list explorer items
&lt;span class="s2"&gt;"list.highlightForeground"&lt;/span&gt;: &lt;span class="s2"&gt;"#5E81AC"&lt;/span&gt;,
// didn&lt;span class="s1"&gt;'t like the terminal cursor colour
"terminalCursor.foreground": "#5E81AC",
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;removed all unnecessary italics (like in comments)&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;convert the JSON to a vscode color theme extension.
for this step you'll need &lt;code&gt;npm&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;install yo and generator-code
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; yo generator-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;run &lt;code&gt;yo code&lt;/code&gt; and select color theme from list of options&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsbjlzzw878qmzr2ezo8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsbjlzzw878qmzr2ezo8.png" alt="yo code options"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the prompt will ask you whether you want to create a color theme from an existing one or start afresh.
&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fie0j8je4zumtg2isbfyv.png" alt="afresh"&gt;
&lt;/li&gt;
&lt;li&gt;select "create a fresh color theme" and give it a name.&lt;/li&gt;
&lt;li&gt;now go to &lt;code&gt;&amp;lt;theme-name&amp;gt;/themes/&amp;lt;theme-name&amp;gt;-color-theme.json&lt;/code&gt; and replace its contents with the &lt;code&gt;untitled&lt;/code&gt; file you've edited before.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that your extension is ready, you need to install it to try the theme out.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;copy the entire folder &lt;code&gt;&amp;lt;theme-name&amp;gt;&lt;/code&gt; to &lt;code&gt;~/.vscode/extensions/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &amp;lt;theme-name&amp;gt; ~/.vscode/extensions/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;restart the editor and hit &lt;code&gt;cmd+shift+p &amp;gt; Preferences: Color Theme&lt;/code&gt;; you'll see your &lt;code&gt;&amp;lt;theme-name&amp;gt;&lt;/code&gt; there.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The comprehensive guide to creating an extension can be found &lt;a href="https://code.visualstudio.com/api/get-started/your-first-extension" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;check out my color theme at &lt;a href="https://github.com/vinaybommana/predusk" rel="noopener noreferrer"&gt;predusk&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/vinaybommana" rel="noopener noreferrer"&gt;
        vinaybommana
      &lt;/a&gt; / &lt;a href="https://github.com/vinaybommana/predusk" rel="noopener noreferrer"&gt;
        predusk
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      predawn and material high contrast theme for vscode
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;I'll try to publish the extension to the vscode marketplace in the future; for now the theme lives on GitHub.&lt;/p&gt;

&lt;p&gt;give a ❤️ if you like this article, and let's discuss down below which theme you are currently using ✨&lt;/p&gt;

</description>
      <category>vscode</category>
    </item>
    <item>
      <title>How I store Screenshot data in my Linux work environment</title>
      <dc:creator>vinay</dc:creator>
      <pubDate>Tue, 12 May 2020 20:26:07 +0000</pubDate>
      <link>https://forem.com/vinaybommana7/how-i-store-screenshot-data-in-my-linux-work-environment-3epd</link>
      <guid>https://forem.com/vinaybommana7/how-i-store-screenshot-data-in-my-linux-work-environment-3epd</guid>
      <description>&lt;p&gt;In my work environment, Screen capture and taking screenshots is a common thing to share the completed status. Ubuntu has the feature of screen capture in three ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;PrtScr captures the entire screen and saves it to Pictures&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Shift + PrtScn captures part of the screen by turning the cursor into a plus sign (same as cmd + shift + 4 on a Mac) and saves the capture to Pictures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ctrl + Shift + PrtScr copies the screen selection to the clipboard.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;The first two options save captured images to the local Pictures folder, which is good, but the image name will be something like Screenshot from 2019-10-30 06-45-37.png. After a while you lose track of dates, and Pictures becomes a mess of screenshots lying around without any info.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution
&lt;/h3&gt;

&lt;p&gt;Simple bash scripting, automated with crontab to run at a particular time every day, solved this. First of all, I wanted to organise all the screenshots by date: every screenshot is placed in a folder named after the date it was taken.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y5VqxaZR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3704/1%2AkFx9ttGlN_t-roQB9D_j1A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y5VqxaZR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3704/1%2AkFx9ttGlN_t-roQB9D_j1A.png" alt="screenshot seperator"&gt;&lt;/a&gt;&lt;em&gt;screenshot seperator&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I’ve placed this in crontab to run every 12 hours.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7-SNly5H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2238/1%2A4fhF05WI5yO4YkORrPv7nA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7-SNly5H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2238/1%2A4fhF05WI5yO4YkORrPv7nA.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Screenshots taken each day go into a new folder, and all the images of that particular date are moved into it. This leads to a large number of folders every month. For minimalism, we can organise these folders further into their &lt;em&gt;particular&lt;/em&gt; month, compress that into a tar file, and store it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0BBRm0rb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/4096/1%2ARX2djJzjZMrRh02PuJRl_g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0BBRm0rb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/4096/1%2ARX2djJzjZMrRh02PuJRl_g.png" alt="compress old screenshots"&gt;&lt;/a&gt;&lt;em&gt;compress old screenshots&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This small snippet searches for screenshots by month, creates a folder, moves all of that month's date folders into it, and compresses the result.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mf-A8UbK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2000/1%2ApYDSlLpRSfFHmaefGgpNWw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mf-A8UbK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2000/1%2ApYDSlLpRSfFHmaefGgpNWw.png" alt=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/vinaybommana/bashNotes"&gt;&lt;strong&gt;vinaybommana/bashNotes&lt;/strong&gt;&lt;br&gt;
github.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>linux</category>
      <category>bash</category>
    </item>
    <item>
      <title>Reading Manga with Python</title>
      <dc:creator>vinay</dc:creator>
      <pubDate>Tue, 12 May 2020 20:21:39 +0000</pubDate>
      <link>https://forem.com/vinaybommana7/reading-manga-with-python-c15</link>
      <guid>https://forem.com/vinaybommana7/reading-manga-with-python-c15</guid>
      <description>&lt;p&gt;Photo by Miika Laaksonen on Unsplash&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Manga ?
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Manga&lt;/strong&gt; (漫画, &lt;em&gt;manga&lt;/em&gt;) are &lt;a href="https://en.wikipedia.org/wiki/Comics"&gt;comics&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/Graphic_novel"&gt;graphic novels&lt;/a&gt; created in &lt;a href="https://en.wikipedia.org/wiki/Japan"&gt;Japan&lt;/a&gt; or using the &lt;a href="https://en.wikipedia.org/wiki/Japanese_language"&gt;Japanese language&lt;/a&gt; and conforming to a style developed in Japan in the late 19th century. They have a long and complex pre-history in earlier &lt;a href="https://en.wikipedia.org/wiki/Japanese_art"&gt;Japanese art&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;let's just say manga are Japanese comics, which are often more popular and interesting than most mainstream comics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scouting
&lt;/h3&gt;

&lt;p&gt;Let's learn some WebScraping and get some value instead of just collecting data: let's download some manga from the internet and try to read it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Reading manga online is easy: you just go to a site like mangapanda.com, search for a comic, and read it. But what if you want to download the entire comic, compress each chapter into a particular volume, and read it offline?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;when we go to mangapanda.com and search for a particular comic, say naruto, here's the URL we are directed to&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--C6n-igl1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2000/1%2AuVD-rmR0l2HYjPy-qA8qLg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--C6n-igl1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2000/1%2AuVD-rmR0l2HYjPy-qA8qLg.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice the naruto at the end of the URL. Now if we go to the first chapter of naruto, the URL transforms to &lt;a href="http://www.mangapanda.com/naruto/1"&gt;http://www.mangapanda.com/naruto/1&lt;/a&gt;, which is just great for us. Note that this doesn't happen with all the manga sites out there, so watch out for that before trying to scrape any other manga site. We are trying to download the images in naruto chapter 1&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Mqd-bavR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/4096/1%2AyyVS6eGaKVRkm03KYZiePQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Mqd-bavR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/4096/1%2AyyVS6eGaKVRkm03KYZiePQ.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s write a small function to get the image from the URL&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kg9VCoVe--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2556/1%2AC9T0jR6SuRLjC73qj4qSDg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kg9VCoVe--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2556/1%2AC9T0jR6SuRLjC73qj4qSDg.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OK, what is happening here? To _download_image we give a URL, say mangapanda.com/naruto/1/3; per our observation this downloads naruto's chapter 1, image 3. Let's break the function down and understand what's going on, line by line.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;requests.get downloads the source of the given URL&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the HTML source is converted into an lxml html tree, which lets us parse tags easily&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;get the img tag with id='img'; the following expression ensures that.&lt;/p&gt;

&lt;p&gt;".//img[@id='img']/@src"&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;after we get the image URL, download the image with requests.get(URL).content&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
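&lt;p&gt;Since the function above only survives as a screenshot, here is a hedged reconstruction of it (names are mine; the site layout is as described):&lt;/p&gt;

```python
# Fetch one manga page and pull out the bytes of its main image.
import requests
from lxml import html

BASE_URL = "http://www.mangapanda.com"

def page_url(manga, chapter, page):
    """Build the reader URL, e.g. page_url("naruto", 1, 3)."""
    return "{}/{}/{}".format(BASE_URL + "/" + manga, chapter, page)

def download_image(manga, chapter, page):
    """Download the page source, find the img tag with id='img', fetch its src."""
    source = requests.get(page_url(manga, chapter, page)).content
    tree = html.fromstring(source)
    img_src = tree.xpath(".//img[@id='img']/@src")[0]
    return requests.get(img_src).content
```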

&lt;h3&gt;
  
  
  Downloading the entire chapter
&lt;/h3&gt;

&lt;p&gt;It's good that the chapters are in the format /chapter/page_number, but how can we download all the images of a particular chapter if we don't know the last page number? If we knew it, we could simply use range and loop over the image numbers to download.&lt;/p&gt;

&lt;p&gt;if we look at the source code, there is this interesting tag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vU0RDpMP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3804/1%2AUnxK9n429w-hB4ydJ3o0Gw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vU0RDpMP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3804/1%2AUnxK9n429w-hB4ydJ3o0Gw.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The site renders this so that users can select the page number from a dropdown. We can query the lxml tree with .//*[@id='pageMenu']/option[last()]/text() and get the last option under the pageMenu id, which is the end page of the chapter.&lt;/p&gt;

&lt;p&gt;let's wrap this up in a small function&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--dafjys7c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3400/1%2AESbc2hiRJV8Ig91o86kA-A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--dafjys7c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3400/1%2AESbc2hiRJV8Ig91o86kA-A.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;now we know the page count of the chapters we are going to download. we can get all the images of a chapter in parallel, sort them, and then compress them into a single volume.&lt;/p&gt;

&lt;p&gt;let's use ThreadPoolExecutor and write a function that does this job concurrently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---q7lDCWP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3940/1%2AFJTHvAc0YoJAg_JaGJYQ9A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---q7lDCWP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3940/1%2AFJTHvAc0YoJAg_JaGJYQ9A.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;properties = json.load(open("configs.json"))

base_url = properties.get("base_url") + "/" + properties.get("manga_name")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;we can define manga_name and base_url in configs.json so that we don't have to give the name of the manga every time we download a chapter.&lt;/p&gt;

&lt;p&gt;the download_chapter function creates directories based on the manga_name and the chapter number&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;➜  naruto git:(master) ✗ tree
.
└── 1
    ├── 1.jpg
    ├── 10.jpg
    ├── 11.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now that we've downloaded all the pages in the chapter, let's compress it into the CBZ format, ensuring the pages stay sorted in the proper order&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yM36szLd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3332/1%2AL2SVuq1n6wVmb_1e97G07w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yM36szLd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3332/1%2AL2SVuq1n6wVmb_1e97G07w.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;we can wrap everything up with a classic &lt;strong&gt;main&lt;/strong&gt; so that, given chapter numbers, we can download the entire comic&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GJXtpVMI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3432/1%2AfS8K4iF2sRm4X9XGSddJsA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GJXtpVMI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3432/1%2AfS8K4iF2sRm4X9XGSddJsA.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  In action
&lt;/h3&gt;

&lt;p&gt;we can run the script in the following way&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--M0QVxTHy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2120/1%2AgamhdHl-KANkKYZTwIm-iA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--M0QVxTHy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2120/1%2AgamhdHl-KANkKYZTwIm-iA.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Disclaimer: this is for purely educational purposes only. Do not use it commercially, for piracy, or for attacking mangapanda.com&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>python</category>
      <category>manga</category>
    </item>
    <item>
      <title>Django + MySQL, How to port your web application from SQLite to MySQL</title>
      <dc:creator>vinay</dc:creator>
      <pubDate>Tue, 12 May 2020 20:16:29 +0000</pubDate>
      <link>https://forem.com/vinaybommana7/django-mysql-how-to-port-your-web-application-from-sqlite-to-mysql-3jnl</link>
      <guid>https://forem.com/vinaybommana7/django-mysql-how-to-port-your-web-application-from-sqlite-to-mysql-3jnl</guid>
      <description>&lt;h2&gt;
  
  
  Django’s Object Relational Mapping Pattern
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A model is the single, definitive source of data about your data. It contains the essential fields and behaviors of the data you’re storing. Generally, each model maps to a single database table.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;we've learnt from the official docs that models.py in Django's folder structure is the source of truth about your data: it contains everything you want to store in your database. we generally define tables, pre- and post-save methods, etc., in models&lt;/p&gt;

&lt;p&gt;we want a table with the following requirements: it should contain the name, count, and timestamp of a metric. we'll create such a table in the following way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vAiCD0_2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2388/1%2A4X4FSVMziFu_i16k6xLHUw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vAiCD0_2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2388/1%2A4X4FSVMziFu_i16k6xLHUw.png" alt="SimpleMetricTable"&gt;&lt;/a&gt;&lt;em&gt;SimpleMetricTable&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Elephant in the Room
&lt;/h3&gt;

&lt;p&gt;After a few months we've realised that the sqlite database, the default when creating the project, is not scaling. when you have multiple sources creating and modifying the DataObjects in your SimpleMetricTable, you need to move on.&lt;/p&gt;

&lt;h3&gt;
  
  
  MySQL to the rescue
&lt;/h3&gt;

&lt;p&gt;django.db.backends.sqlite3 is how we tell django to use sqlite as the backend db. we'll configure the mysql database backend first and then tell django to use django.db.backends.mysql.&lt;/p&gt;

&lt;p&gt;the proper configuration takes the following form&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yiwq0ZU4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3604/1%2ACVVLA7UYMLHL9-KSdIEm9A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yiwq0ZU4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/3604/1%2ACVVLA7UYMLHL9-KSdIEm9A.png" alt="this can be done in settings.py"&gt;&lt;/a&gt;&lt;em&gt;this can be done in settings.py&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;log in to your mysql database and create a database, say metrics, using CREATE DATABASE metrics; we'll reference this database in mysql.conf&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Xo7NMwUp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2456/1%2AUrFf-w3Ybl5OBpktcX9pmQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Xo7NMwUp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2456/1%2AUrFf-w3Ybl5OBpktcX9pmQ.png" alt="polls/configs/mysql.conf"&gt;&lt;/a&gt;&lt;em&gt;polls/configs/mysql.conf&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;But hey 👋🏻 what about the data I've collected so far? how do we port the old data from sqlite.db to our shiny new MySQL?&lt;/p&gt;

&lt;h3&gt;
  
  
  Dumping db to JSON in Django
&lt;/h3&gt;

&lt;p&gt;simply run the following command&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CW_l8E3L--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2052/1%2A6_MBpEqs0Go1gWPwcsZAVw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CW_l8E3L--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2052/1%2A6_MBpEqs0Go1gWPwcsZAVw.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that this should be done before changing the database from sqlite to MySQL in settings.py. after you've changed the database, we can simply run&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vYUcVa6t--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2000/1%2AKWuhVT7TGeLiGXxlpjPCNg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vYUcVa6t--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2000/1%2AKWuhVT7TGeLiGXxlpjPCNg.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After a bunch of IntegrityErrors and a couple of google searches, using flags like --exclude auth.permission --exclude contenttypes while dumping the data, we've successfully ported our application to MySQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;you've patted yourself on the back for a good day at work and started packing up while watching your api output in the log. 👁 your application starts throwing 500s on some of the write requests. you check the code and everything looks fine. some requests succeed, but those 500s make you sit back down.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Somewhere at the back of your mind there is an itch that this is due to the MySQL change you've made today. you start checking what type of data is being written into your database. a thousand google searches follow.&lt;/p&gt;

&lt;p&gt;Then you realise the mistake: the SimpleMetricTable you've ported from sqlite to MySQL uses a latin character set and is not accepting utf-8 in MySQL&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,
       information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
WHERE CCSA.collation_name = T.table_collation
  AND T.table_schema = "metrics"
  AND T.table_name = "metrics";

ALTER TABLE metrics CONVERT TO CHARACTER SET utf8;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;and you are good to go.&lt;/p&gt;

&lt;p&gt;footnotes:&lt;br&gt;
this was originally written a while ago on Medium&lt;br&gt;
&lt;a href="https://medium.com/@vinaybommana7/django-mysql-how-to-port-your-web-application-from-sqlite-to-mysql-f7487428a0d0"&gt;medium link&lt;/a&gt;&lt;/p&gt;

</description>
      <category>django</category>
      <category>mysql</category>
      <category>orm</category>
    </item>
    <item>
      <title>Analyzing Twitter data with Python: Part 1</title>
      <dc:creator>vinay</dc:creator>
      <pubDate>Wed, 26 Feb 2020 14:19:15 +0000</pubDate>
      <link>https://forem.com/vinaybommana7/analyzing-twitter-data-with-python-part-1-hg</link>
      <guid>https://forem.com/vinaybommana7/analyzing-twitter-data-with-python-part-1-hg</guid>
      <description>&lt;h1&gt;
  
  
  The Question
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;What if we want to understand the impact of a user's tweet on a particular topic?&lt;/strong&gt; Let's say a user tweeted about a particular product, like shoe laces, on Twitter: how likely are their followers to buy that product based on the tweet?&lt;/p&gt;

&lt;p&gt;Let's analyze this scenario using machine learning by constructing a simple model. We'll get data directly from Twitter and try to filter and clean it to train our model. Let's see how much we can learn from this.&lt;/p&gt;

&lt;p&gt;We'll break down the entire process into the following steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In &lt;code&gt;Part 1&lt;/code&gt; we'll focus on gathering and cleaning the data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Understanding the Flow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Gathering Data&lt;/li&gt;
&lt;li&gt;Cleaning Data&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Gathering Data &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;The main aspect of analyzing Twitter data is to &lt;em&gt;get the data&lt;/em&gt;. How can we get Twitter data in large amounts, say 10 million tweets on a particular topic?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We can access Twitter data through &lt;a href="https://developer.twitter.com/"&gt;Twitter's developer&lt;/a&gt; access token authorization.&lt;/li&gt;
&lt;li&gt;We can scrape Twitter directly and get the data.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Accessing from twitter's developer access token
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xJkKfCa3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/a8h6oge62lbr6dt4ekop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xJkKfCa3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/a8h6oge62lbr6dt4ekop.png" alt="Twitter Developer Preview"&gt;&lt;/a&gt;&lt;br&gt;
You can simply &lt;a href="https://developer.twitter.com/en/application/use-case"&gt;apply&lt;/a&gt; for an access token, which is useful for getting tweets through the Twitter API. We can use &lt;a href="https://github.com/tweepy/tweepy"&gt;tweepy&lt;/a&gt; for that.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iGquDPGy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/q64to5l2i6nzunumlp8f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iGquDPGy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/q64to5l2i6nzunumlp8f.png" alt="Twitter API request"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h5&gt;
  
  
  The Problem
&lt;/h5&gt;

&lt;p&gt;The problem with using tweepy and Twitter's API is that there is a rate limit on the number of API calls a particular user can make per hour. If we want a large amount of data, like 10 million tweets, this will take forever. Searching through tweets from a particular period was also not effective for me with Twitter's API. Under these circumstances I decided to scrape Twitter's data using an amazing Python library called &lt;a href="https://github.com/taspinar/twitterscraper"&gt;twitterscraper&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;
  
  
  Scraping Twitter directly
&lt;/h4&gt;

&lt;p&gt;Let's install &lt;code&gt;twitterscraper&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fhGSOcEo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/s6f40as4odtmw5cn67kq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fhGSOcEo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/s6f40as4odtmw5cn67kq.png" alt="twitterscraper"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The best thing about twitterscraper is that we can specify the topic, the time period, the maximum number of tweets, and the output format in which the tweets are to be obtained.&lt;/p&gt;

&lt;p&gt;For the sake of understanding, let's download 1000 tweets and try to clean them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# twitterscraper &amp;lt;topic&amp;gt; --limit &amp;lt;count&amp;gt; --lang &amp;lt;en&amp;gt; --output filename.json&lt;/span&gt;
twitterscraper python &lt;span class="nt"&gt;--limit&lt;/span&gt; 1000 &lt;span class="nt"&gt;--lang&lt;/span&gt; en &lt;span class="nt"&gt;--output&lt;/span&gt; ~/backups/today&lt;span class="se"&gt;\'&lt;/span&gt;stweets.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output from &lt;code&gt;twitterscraper&lt;/code&gt; is &lt;code&gt;json&lt;/code&gt;. Let's convert the data we've obtained into a &lt;code&gt;dataframe&lt;/code&gt; and clean it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cleaning Data &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  loading the downloaded &lt;code&gt;json&lt;/code&gt; to a &lt;code&gt;pandas dataframe&lt;/code&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;codecs&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chained_assignment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="c1"&gt;# this enables us for rewriting dataframe to previous variable
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;

&lt;span class="n"&gt;json_twitter_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&amp;lt;path to json file&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yFyxxHGI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/zrmn42a3y351fvmy0f35.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yFyxxHGI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/zrmn42a3y351fvmy0f35.png" alt="output-1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's clean the data now. From the &lt;code&gt;head()&lt;/code&gt; we can eliminate &lt;code&gt;url&lt;/code&gt;, &lt;code&gt;html&lt;/code&gt; and &lt;code&gt;replies&lt;/code&gt;, and also &lt;code&gt;likes&lt;/code&gt; for now; we'll get back to &lt;code&gt;likes&lt;/code&gt; afterwards.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# dropping html, url, likes and replies
&lt;/span&gt;&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'html'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'likes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'replies'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let's rename the columns so we have clean &lt;code&gt;user&lt;/code&gt; and &lt;code&gt;fullname&lt;/code&gt; columns; we'll get each user's &lt;code&gt;user_id&lt;/code&gt; later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="c1"&gt;# renaming column names
&lt;/span&gt;&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'fullname'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Tweet_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'retweets'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Tweet'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Date'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;twitter_data_backup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_twitter_data&lt;/span&gt;
&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6kMz0A77--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/gkxjrmcr3zt4orbqrhj8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6kMz0A77--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/gkxjrmcr3zt4orbqrhj8.png" alt="output-2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Note the &lt;code&gt;retweets&lt;/code&gt; column in the &lt;code&gt;dataframe&lt;/code&gt;:
we can assume that posts with retweets have a larger impact on users, so let's filter for tweets with a retweet count greater than &lt;code&gt;zero&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;json_twitter_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retweets&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hGj6MTJ2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/eqyocqqyibwnxzu19x64.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hGj6MTJ2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/eqyocqqyibwnxzu19x64.png" alt="output-3"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the data one user can tweet multiple times, so we need to separate users based on their tweet count.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="c1"&gt;# first remove  date column
&lt;/span&gt;&lt;span class="n"&gt;twitter_data_with_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_twitter_data&lt;/span&gt;
&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Date'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Tweet'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xMC8b0Ax--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/cacl4xmec0o6t0npruj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xMC8b0Ax--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/cacl4xmec0o6t0npruj3.png" alt="output-4"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;now group the dataframe based on &lt;code&gt;users&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# rather than dropping duplicated we can `groupby` in pandas
# twitter_data.duplicated(subset='user', keep='first').sum()
&lt;/span&gt;&lt;span class="n"&gt;tweet_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="n"&gt;as_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# tweet_count['mastercodeonlin']
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;tweet_count&lt;/code&gt; behaves like a dictionary, so we can now look up the tweet count of a particular user&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--250N94CZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/xx1ojyf5iel3p27e0mqg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--250N94CZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/xx1ojyf5iel3p27e0mqg.png" alt="code-tweet-count"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We can now add a &lt;code&gt;no_of_tweets&lt;/code&gt; column to the dataframe
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'no_of_tweets'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_tweet_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;twitter_data_without_tweet_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_twitter_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop_duplicates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keep&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"first"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;twitter_data_without_tweet_count&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;twitter_data_without_tweet_count&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xitCHjsa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/4798584cqwbtmf5711fe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xitCHjsa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/4798584cqwbtmf5711fe.png" alt="output-5"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the next part we'll focus on getting the user_ids of particular users and analyzing the dataframe by converting it into numerical form.&lt;/p&gt;

&lt;p&gt;Stay tuned, we'll have some fun...&lt;/p&gt;

</description>
      <category>python</category>
      <category>twitter</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
