Category Archives for Data Science

Cloning a GitHub Repository from the Mac Terminal

I’m starting to play with deep learning, machine learning, and artificial intelligence from a variety of angles: statistics, linear algebra, calculus, and Python, using Khan Academy, DataQuest.io, and courses on Coursera, Udacity, and edX (refreshing my memory in some cases). So I thought I would start to blog about my discoveries, which will hopefully help you as well.

I was watching a video by Siraj Raval about Python for Data Science, and he had put a code sample up on GitHub. GitHub is an online hosting service for code repositories where people post and share code with each other. It’s becoming more of an online resume where employers can see that you’ve actually worked on projects, not just padded your resume 🙂

So when we find a cool project we want to play with, we can download the code to our local machine using Git on our Macs. Git is a code versioning system (maintaining/updating code in an organized way) and GitHub is an online service that hosts Git repositories. If you don’t know how to install Git, check out this article on installing Git.

Instead of downloading a zip file, forking the repo (using the GitHub website to copy the code to my GitHub account), or using GitHub for Mac, I wanted to download the code from the command line.

In the image above you see a green Clone or download button for a GitHub project. The project uses Scikit-learn for Python to do data analysis. Click that button to see a dropdown where you can copy the repository URL (it ends in .git). We’re not going to download the ZIP file. We’re going to pull the files from the Mac terminal instead. Go ahead and open a Mac terminal (it’s under Applications->Utilities). Go to the directory where you’d like to put the code (I use the default, which is /Users/myusername).

To clone a Github repository you just type:
git clone <URL to repository>

so for us, this is:
git clone https://github.com/llSourcell/gender_classification_challenge.git

This pulls the code down and creates a directory named after the project (gender_classification_challenge). Now you can cd (change directory) into the gender_classification_challenge directory and play with the code like I am going to do!
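If you find yourself doing this often, the clone-then-cd steps could be wrapped in a couple of small shell functions. This is only a rough sketch, not anything from the video or the project: the function names repo_dir_from_url and clone_and_list are made up for illustration, and it assumes Git is already installed and that the URL ends in .git.

```shell
# Hypothetical helper: work out the directory name that git clone
# will create from a repository URL (strips the path and ".git").
repo_dir_from_url() {
  basename "$1" .git
}

# Hypothetical helper: clone the repository if the directory is not
# already there, then move into it and list the files.
clone_and_list() {
  url="$1"
  dir="$(repo_dir_from_url "$url")"
  if [ ! -d "$dir" ]; then
    git clone "$url" || return 1
  fi
  cd "$dir" && ls
}

# Usage (needs network access, so it is left commented out here):
# clone_and_list https://github.com/llSourcell/gender_classification_challenge.git
```

Checking for the directory first means you can run the same command again later without Git complaining that the destination already exists.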

Recognizing Trends For The Sake of Your Career

In this post, I’m going to talk about various decisions I made that led to amazing opportunities. I believe you can achieve this predictive capability as well by observing, reading and having the mindset to watch for trends. Of course, you should do things that you are interested in, not just follow trends!

After a few years of doing structural engineering in consulting firms, I realized I wanted to return to my childhood passion, which was computer programming (my first computer was a Timex Sinclair ZX81). So around 1999, I looked into the industry and felt that object-oriented programming was where things were headed (a way to organize code into objects rather than endless lines of code). There happened to be a fast-track program at the University called OOST (object-oriented software technology). We learned different things but I felt that Java was pretty amazing and “free” or open-source (headed by Sun Microsystems at the time) and where things were going. Also, web-based applications were getting pretty interesting (much more powerful than the usual ‘static’ HTML websites), so I decided to work at Servidium which was developing a web-application framework called Jaydoh. Frameworks make it easier to build web apps and allow you to separate the view (HTML – what you see in the browser) from the controller (Java – the logic) which are also usually different skill sets.

Jaydoh was basically competing with Struts (an Apache open source framework), so the challenge to get sales was large, i.e., to sell a proprietary framework when an open source one was already available. So I decided that I should get into open source Java instead for the sake of my career. That led me to work at Digital Oilfield (DO), which was using J2EE (Java Enterprise Edition) to run their apps.

As it happened, DO was about to release a new version of their software, so they asked me if I wanted to learn something called webMethods. I said sure even though I had no idea what it was (always good to learn new skills). They needed a way to exchange invoice files between companies and were originally thinking of using Java (servlets) unless I could figure out webMethods quickly (which I did). This led me to learn about the new area of ‘Enterprise Application Integration’ or EAI and B2B (business to business) transactions (essentially exchanging data like invoices and purchase orders between companies). At that point, I realized that this was an important and growing area. I’ve been working in this area ever since (about 2003).

Somewhere along the line people started talking about web services. So instead of applications full of code that are hard to reuse, we started to think about creating web services (similar to functions but accessible on the internet). In the corporate world this became SOAP (Simple Object Access Protocol) and on the internet, it became REST (Representational State Transfer). SOAP is pretty complicated compared to REST, which is another important fact to take note of.

During my work as an integration consultant, I noticed that new areas were getting some interest such as business process modeling (BPM). I was pretty interested in this as well as it made sense to set up a process (step by step tasks that need to be done in a common business process) and plug in either automated or human-performed tasks. This is a higher-level layer than the integration layer of course. The challenge for me was that none of the companies I was getting called by had these types of opportunities (it was fairly cutting edge at the time). Also as a contractor, you are paid for your expertise so whenever you have a major learning project it’s probably best to join a company as an employee so you can learn the new skills. Another way is to pay for your own training and try to be put on a project with other experienced people (in BPM, for example). This is a bit riskier as you have the knowledge but not the experience.

I decided to keep doing webMethods projects which were lucrative and allowed me to ‘retire’ in my early 40s. In 2010 I moved to a semi-rural area of Eastern Canada but was still taking various webMethods projects with large breaks (usually many months) in between. The last one was only 1 day a week from home which was great because I could work on other things of interest. But in general, this work was getting pretty boring (not much new learning).

A few years ago I finally decided to get my health in order. So after reading a lot of books, I felt that a plant-based diet made the most sense. I ended up losing over 35 pounds, lowered my blood pressure and lowered my cholesterol to ‘heart attack proof’ levels. I’m on no medications at age 47. In fact, I recently had to buy 30″ jeans which is crazy to me (I’m 6′ tall). So I recommend working on your ability to search, read books and papers and try to decipher some of the studies (say on Google Scholar) as it can be tricky to depend on an ‘expert’ in the field (many of them disagree with each other). My success with this approach ended up turning into an online business (Potato Strong) with ebooks, a program, a course, and coaching along with various social media channels that I maintain.

During the past few years, webMethods integration opportunities have diminished somewhat for various reasons (licensing fees, software competition, the influx of cheaper and/or offshore labor, etc) so here we are at another decision point. I’ve been working on other things but my mathematical and programming interests seem to keep coming back. I feel like there’s so much more I can do that I didn’t get into. I received a Ford Motor Company scholarship in 1988 which paid all my engineering tuition plus some living expenses (value $18,000), and then won an NSERC scholarship which paid for my Master’s degree.

Lately, I’ve been looking into deep learning, which is a subset of machine learning, which in turn is a subset of artificial intelligence. Related to that area is data science. Last year I took a Computational Investing course on Coursera taught by Tucker Balch of Georgia Tech. Google, Facebook, Microsoft, and others are investing billions of dollars in the area of deep learning. Just to give you an idea of how much better computers are getting at this type of work, there are computers winning at Jeopardy, beating people at chess and Go, recommending what Netflix shows you might like to watch, tagging photos on Facebook automatically (facial recognition), translating languages (Google Translate), not to mention self-driving cars.

If you’re thinking of career longevity, you might want to focus on things that require very high-level knowledge or one-on-one contact (nursing, for example). Even jobs like taxi or truck driving could be replaced by self-driving vehicles. At a minimum, these are fun things to read about and even play with. Keep your eyes open for changing trends and technologies that could affect your job security.

Deep Learning and Data Science – New Blog Topic

My website (the one you are on now) has historically been about guitar playing and teaching. I still play or practice every day as it’s a longtime passion. I try to focus on one topic at a time, so currently it is using minor pentatonic scales (more so sequences) over jazz progressions (if interested, drop me a message – I was working on an ebook about this).

After I lost a bunch of weight eating a plant-based diet (I’m now in 30″ jeans at age 47 at 6′ tall) I created the www.potatostrong.com website along with a ‘Potato Strong’ profile for each of YouTube, Facebook, Instagram, Twitter, Pinterest, Tumblr. That’s been going pretty well and it feels good to help a lot of people lose weight, get off medications and help the animals and the environment.

For Potato Strong I developed a couple of ebooks, a program, a course, and coaching, and as I started to make sales I shared this information on a Facebook page called Share Your Passion Online. Every month I shared my total online income, which grew from nothing to a modest monthly income that helps pay the bills. I then had it on auto-pilot to some extent (using MeetEdgar and BoardBooster) but I would still post what I ate most days (to help people see what to eat) and also do YouTube videos, which are fun. But I needed a new challenge as I love to learn new things.

My background is engineering (I have a Master’s degree) and computer programming (diploma in object-oriented software technology). I started out as an engineer and then switched over to software development, where I ended up doing integration for large companies using webMethods software (now part of Software AG).

webMethods contract opportunities have slowed substantially in the past few years. I used to get calls from a lot of recruiters and had a few close relationships with small consulting firms that specialize in this area.

In one of the many books I’ve read lately (can’t remember which one) they suggested thinking about what you liked doing as a child. While this might not work in every case, I used to program computers in my basement. It was fun to make the computer do things. I started with a Timex Sinclair computer that used a regular TV and had no data storage (I would eventually turn off the computer, losing everything) before I added a regular tape recorder. Then I met a friend in high school and I loaned him my Atari game system in exchange for his VIC-20 (with a tape recorder). Then I eventually got a Commodore 64 and my high school had PET computers.

I’ve always loved to learn and am constantly reading books on various topics. It’s a blessing and a curse because it’s hard to do the same thing every day, especially if there is no learning component. So last year I took a Python course online that involved stock market predictions using pandas, NumPy, etc. I did very well and was helping others in the forums.

For some reason, I recently started thinking about artificial intelligence, machine learning, and deep learning. It’s a complex area covering algebra, calculus, probability, computer programming and more (which is pretty much in my study background). I’m going to start with data science projects for the most part using a site called dataquest.io. This area touches pretty much every area of work from health care to social media as it helps employers figure out best business practices.

I’ll be posting my discoveries along the way here. Hopefully, I can add some guitar learnings and other topics over time. The topics are categorized in the top menu if you want to focus on one particular area.