Saturday, October 31, 2015

The Minimalists

I am sure that I have told you many times that I just love this site, The Minimalists. I am literally learning tons from each and every article. It's a way of life, a way of thinking and, for me, a way to live more - to be more alive and more aware of what I have in my life.

As a Student of the Arcane art of Programming, I thought: why not have my own personal copy of the Site for off-line reading? I have been studying Web Scraping a lot lately, so why not experiment on the Awesome Blogs which I read often?

Fast forward a couple of hours, and I had a set of 424 text files holding the text content of the various blog posts on The Minimalists ;P

Actually, it was a great experience and in the process I have learned :-

> To respect the site by slowing down the spiders ( robots/crawlers)

> To look up the robots.txt, to find out what the site explicitly permits crawlers to do

> How to avoid walking down the rabbit holes, something I am starting to be aware of in my nature.
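
The first two lessons above can be sketched in a few lines of Python. This is only a minimal illustration, not the code I actually ran: the function names, the "my-offline-reader" user agent and the 2-second delay are all made up for the example, and `fetch` stands in for whatever download function you prefer.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robot_rules(robots_txt):
    """Parse the text of a robots.txt file into a rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_crawl(urls, rules, fetch,
                 user_agent="my-offline-reader",  # hypothetical name
                 delay_seconds=2.0):              # assumed, not a standard
    """Fetch every URL that robots.txt allows, pausing between requests.

    `fetch` is any callable that takes a URL and returns the page text.
    """
    pages = {}
    fetched_any = False
    for url in urls:
        if not rules.can_fetch(user_agent, url):
            continue  # the site asked crawlers to stay out of this path
        if fetched_any:
            time.sleep(delay_seconds)  # slow the spider down between hits
        pages[url] = fetch(url)
        fetched_any = True
    return pages
```

Passing the download function in as `fetch` keeps the politeness logic separate from the networking, so the same loop can be tried out with a fake fetcher before pointing it at a real site.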


Things which still need to be done as far as my Scraping skills are concerned :-

> Actually converting and formatting the text files to a proper, Nook-legible PDF ( or other formats ) perhaps via LaTeX using the PyLaTeX package.

> My Code still doesn't tell me whether there was an Image or a Video on the page, or whether the page no longer exists. A minimal bit of intelligence would make for a better crawler, though in any case it'd still be far less sophisticated than what the professionals use - think Google, Baidu et cetera.

> My Code isn't quite sound; it is still the code-and-throw-away kind. For example, I am not really using well-designed functions, which would be a step towards Reusable Code.

> I'd like the content to really look the way it should. Right now it's just bare-bones txt files, whereas a somewhat more sophisticated PDF makes for pleasant reading.
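
For the second item on this list, here is a rough sketch of how a small, reusable function could report missing pages and media using only the standard library. The names `MediaCounter` and `describe_page` are hypothetical, and treating only status codes 404 and 410 as "page no longer exists" is my own simplifying assumption.

```python
from html.parser import HTMLParser


class MediaCounter(HTMLParser):
    """Count <img> and <video> tags while parsing a page."""

    def __init__(self):
        super().__init__()
        self.images = 0
        self.videos = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images += 1
        elif tag == "video":
            self.videos += 1


def describe_page(status_code, html):
    """Summarize what the crawler found at one URL."""
    counter = MediaCounter()
    counter.feed(html)
    return {
        # Assumption: 404 and 410 are the only "gone" signals we check.
        "exists": status_code not in (404, 410),
        "has_image": counter.images > 0,
        "has_video": counter.videos > 0,
    }
```

Videos embedded through an iframe would slip past this check, so it is a starting point rather than a complete detector.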


Next up - Text Processing, LEVEL UP!

Fight-o;P

Sunday, October 25, 2015

Carrying it around?

Rewind a couple of years, and I was really a sucker for "Buying the Books" and having to carry them with me All-The-Time!

Not so much anymore, 'cos basically anything that I might possibly run into has already been solved and documented on some forum on the Internet. To quote Sherlock Holmes, "There is nothing new under the Sun". So why would I burden myself with so much physical luggage?

Yes, I do understand that the feel of a book in one's hand might seem more real, and if I am doing some non-technical reading I'd definitely look for a physical book as well. But from my experience so far, most technical books are like written Sheet Music. Of course you are going to benefit if you read it all thoroughly, but personally I am more of a learning-by-doing kind: I need to play out that Music on an actual Instrument.

I need to do experiments to learn something, I need to make silly mistakes to learn and I need to be able to feel what an understanding of the subject would look like!

For the Programming, Mathematics and Physics related books, I very much prefer to have them in a digital format and, of course, a good internet connection to get over the speed-breakers that I often run into.


Friday, October 23, 2015

Internet - A Gigantic API

Web Scraping is becoming a passion of mine. When you come to think of it, it draws on a lot of the ToolKit: Language Processing, Databases, Regular Expressions and so on and so forth.

Isn't designing a spider much like designing a miniature piece of software, with its proper set of variations, specifications and even Testing/Debugging as well?

Besides, it just feels good to know that you can only automate this long and dreary Ctrl+S process once you really understand the website!

Of course, this also improves one's know-how about the Internet: where the ethical boundaries of automating things lie, and when we might be harming a website/server.

You know, while we are talking about generalizations - I am definitely a Believer in Python and its sheer popularity in the programming world. Chances are that if you have a programming problem, it has already been taken care of by some library.

Seriously people, one can learn an entire computer science curriculum with Python alone, ranging from the typical "Intro. to Computer Science" to "Digital Image Processing" and "Cryptography" courses.

For some reason, it seems to be against Basic Human Rights, to make Education Fun!

[Edit 26-Oct-2015 ]

Doesn't feel right to end anything on a negative note, so I am gonna rephrase this one -

"Education is one of the Basic Human Rights, why shouldn't we make it fun?"

Monday, October 19, 2015

Fast and Furious learning curve;P

This past month has involved a lot of Python scripting with Anaconda and I genuinely liked the coding part. I had to juggle all three major platforms - Windows, Ubuntu and MacOS.

Trust me, without the wonderful Anaconda ( no, not the Snake silly;D ) I wouldn't have been able to do anything at all. 

I relied on Anaconda for Scraping, Visualization and dealing with various Data formats. In the process I realized over and over again that not being able to deal with Pandas and HDF5 formats was a huge hindrance to making progress. Actually, I am barely at peace with CSV files, let alone the complex ones.

But, no worries. Anaconda/Jupyter/SageMath are things which I would love to Master anyway! I obviously see myself using them for a long time in the Future.

For the visualization part, I focused on Bokeh but soon realized that the skill level the task asked for was still a few miles away, so I explored another wonderful option, Plotly. I experimented a lot with Plotly and somehow got the Visualizations working on a real-time basis. I see a long road ahead though, and I find the journey exciting.

Curious person that I am, I tried to use Plotly with Julia as well, but as Julia recently hit version 0.4, I think it affected the compatibility between the two. No surprise there though - Julia is moving fast and is constantly evolving ;P

However, it would have made things a lot easier for me - as Pandas doesn't have a counterpart in Julia, any CSV file would have done just fine in this case.... Well, moving on :)

Another thing I look forward to in the coming months is JavaScript - not only the Visualization part but the language itself. I see that language everywhere and, of course, there was a lot of HTML and CSS code too, so I think by the end of this learning curve I'll start seeing the Rainbow in this Digital World.

Sunday, October 18, 2015

Python, Julia and JavaScript

I have been dabbling so, so much in Python recently, and I see the other two coming up pretty quick in my life - I guess I do love Interpreted Languages a lot ;P

You know, it might seem that I am getting good with Python and having a deeper understanding of Computation, in general - for some reason I can't believe that's really the case.

Sure, I am getting used to various libraries in the language and to solving problems by thinking within the language, but I am yet to use classes properly, yet to use the various awesome features of these languages, and so on and so forth. In short, as of now I am only getting-it-done-barely and there is still not a shred of Beautiful Code in my coding.

Nor have I really taken strides as far as my comfort level with Emacs or Atom or the Zsh is concerned. I've been using them here and there for sure, but I think Being Good at Coding is more like thinking like Sherlock Holmes: there are many, many ways to do things, but there's a short, sweet and clever answer as well. I just need to notice!

In any case, I am happy that Python is really becoming my Mother Tongue in the world of Programming.



Friday, October 16, 2015

Sit for Interviews, nah...!

I know it has been a while since I last posted but believe-you-me, this month has been the busiest one so far, and perhaps the most decisive one in this year. 

I have ended up becoming a co-founder of a StartUp ;P

Yeah, it's that exciting and it's all the more fun for me 'cos over at the University my batchmates are sitting for their Interviews, with all the jargon that comes along with this thing. I might actually be taking interviews by the end of this year!!

Actually, it came about in quite an unexpected way. They had told me about this idea a year back, and it seemed "different" but certainly not outstanding. But this year the topic came up again, and as I was working with Big Data and visualization on my college project, I suggested: why not look into visualization?

Then we thought along this line for a while and eventually they really liked the possibilities that this little thing opens up. So, after a couple of weeks I ended up meeting a couple of Mentors and Investors ( from the US and Japan respectively ) regarding the Idea. I have to say that the reviews were awesome! Especially given that the Japanese apparently don't give way to emotions this easily, or so I have been told ;)

You know, I feel quite happy about it all. Remember that during the course of this blog I explored Data Scraping as part of my attempt to learn Songs, of course using Python and the Anaconda environment. Guess what?

The experience came in handy for the current StartUp project I am working on! This time I had to take it up to a whole different level and, again, things just fell into place for me. I am really starting to believe in my Gut feeling, you know. It has never led me wrong ;P

Will keep you guys posted on the latest happenings,
Stay Awesome!
Change the world for the better=)