Upd

I haven’t posted anything for quite a while, but actually I keep learning. It’s always somewhat sad to see these abandoned blogs created for peer-learning with a couple of posts and then no updates, so you just don’t know if their authors are still learning or gave up on it. Well, I haven’t. True, I’m more into platform MOOCs at the moment, so I’m not using this blog for peer-learning purposes directly. But I generally like this international open peer-learning project and I’m going to update this blog from time to time.

There’s a good occasion for this post: I’ve just completed An Introduction to Interactive Programming in Python at Coursera. I finally made it after two failed attempts. It was challenging, and I’m not sure I’d have made it without some preparatory work at Codecademy and the help of Zed Shaw’s ‘Learn Python the Hard Way’ (a great educational project, by the way).

Just for show, here are the links to the mini-projects I completed during the course. The links lead to my code in CodeSkulptor, an online application created by one of the instructors for writing and running Python code. In case someone wants to have a look, the best way to do it is in Chrome (Firefox and other browsers may cause some bugs).

This is actually the first part of the Fundamentals of Computing specialisation. The next course in the sequence, Principles of Computing, starts in February 2015 and I’m totally going to try it. Before it begins, I’m going to have some fun at Khan Academy.

Finally, here are some courses I’d like to look at more closely at some point. Maybe someone will find them fascinating as well. If anybody has already taken some of them, it would be great if you shared your opinion.

First completed course at Coursera

A week ago, I completed Computing for Data Analysis by Prof. Roger Peng at Coursera. The course was described as an introduction to the R language. This may have been somewhat confusing: it was indeed an introductory course for those totally new to R, but not for those totally new to programming in general, which the course description didn’t state directly. Judging by the numerous complaints on the course discussion forum, some people with no programming experience whatsoever were having a really hard time figuring out where to start.

On the other hand, even a very distant familiarity with programming basics in Python made things a bit more tolerable for me than they would have been had I never seen things like an IDE or a for loop before. So for me the course was rather challenging and even frustrating at times, but to my huge surprise I was able to complete the assignments. This doesn’t mean, of course, that I have perfectly understood, digested and mastered all the material. But after the course I feel much more confident in the R environment. Even more importantly, the course helped me map my skills, so now I know what I need to learn better, where and how I can look for help, and which parts of my knowledge I can rely on. All in all, I’m glad I took this course. Thanks to Dr. Peng and his wonderful teaching assistants, who did a huge amount of work re-explaining the course material so that even total newbies could keep up.

By the way, I think the course is still available as an archive at Coursera. Its video lectures are also available on YouTube.

Also, I must admit, I have developed Stockholm Syndrome (I mean, begun to like R).

And I’ve spent almost two notebooks on it, because I really feel more confident when I make notes on the way.

Now, as a follow-up, I played a bit with the dataset used for our last assignment, which focused on regular expressions. We worked with the homicide data from the Baltimore Sun site, which provides an interactive application for navigating these data but doesn’t offer them in a downloadable format. So Dr. Peng simply copied them from the page source and pasted them into a text file. Here it is.

For our assignment we had to write two functions. One had to count the number of victims given the cause of death. The other had to count the number of victims of a given age.
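To give an idea of what such functions do, here is a rough sketch in Python rather than R (the assignment itself was in R). The record layout, the sample records and the names `count_by_cause` and `count_by_age` are all made up for illustration; the real file and the course's function names may well differ.

```python
import re

# Invented sample records; the real file's layout is an assumption here.
SAMPLE_RECORDS = [
    "<dd>male, 17 years old</dd><dd>Cause: shooting</dd>",
    "<dd>female, 34 years old</dd><dd>Cause: Asphyxiation</dd>",
    "<dd>male, 34 years old</dd><dd>Cause: Shooting</dd>",
    "<dd>male, 52 years old</dd><dd>Cause: stabbing</dd>",
]


def count_by_cause(records, cause):
    """Count records whose 'Cause:' field matches `cause`, ignoring case."""
    pattern = re.compile(r"Cause:\s*" + re.escape(cause) + r"\b", re.IGNORECASE)
    return sum(1 for record in records if pattern.search(record))


def count_by_age(records, age):
    """Count records that mention exactly `age` years old."""
    pattern = re.compile(r"\b" + str(age) + r" years? old")
    return sum(1 for record in records if pattern.search(record))


print(count_by_cause(SAMPLE_RECORDS, "shooting"))
print(count_by_age(SAMPLE_RECORDS, 34))
```

The `\b` word boundary in the age pattern is there so that asking for age 4 does not also count someone who is 34.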

I wanted to find out if there are any preferred ways of murder given a gender. I also wanted to visualise my results. To this end, I first wrote a function that sorted victims by gender given a cause and returned the result as a data frame. Then I wrote another function that joined the output of the first one into a general data frame for all the causes presented in the dataset. I realize my code is not exactly neat and nice, but I’m glad that at least it works.
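My actual code was in R and returned data frames. As a hedged illustration of the same idea, here is a small Python sketch in which plain dictionaries stand in for the data frame. The sample records, their layout and the name `gender_by_cause` are invented for this example and are not taken from the real dataset.

```python
import re
from collections import Counter, defaultdict

# Invented sample records; the real file's layout is an assumption here.
SAMPLE_RECORDS = [
    "<dd>male, 17 years old</dd><dd>Cause: shooting</dd>",
    "<dd>female, 34 years old</dd><dd>Cause: Asphyxiation</dd>",
    "<dd>male, 34 years old</dd><dd>Cause: Shooting</dd>",
    "<dd>male, 52 years old</dd><dd>Cause: stabbing</dd>",
]

# Capture the gender first, then the single word that follows "Cause:".
RECORD_PATTERN = re.compile(r"\b(male|female)\b.*?Cause:\s*(\w+)", re.IGNORECASE)


def gender_by_cause(records):
    """Return {cause: {gender: count}} for all records that match the pattern."""
    table = defaultdict(Counter)
    for record in records:
        match = RECORD_PATTERN.search(record)
        if match:
            gender = match.group(1).lower()
            cause = match.group(2).lower()
            table[cause][gender] += 1
    return {cause: dict(counts) for cause, counts in table.items()}


table = gender_by_cause(SAMPLE_RECORDS)
for cause, counts in sorted(table.items()):
    print(cause, counts)
```

Records that don't match the pattern are silently skipped here, which is fine for a sketch but would need checking on real data.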

And well, I actually found out that the most common cause of violent death in Baltimore in the period from 2007 to 2012 was shooting, and that in 1126 of the 1245 observations the victim was male. It looks like this:

[Figure: bar chart of victims by gender]

Also, the only category in which female victims prevail is asphyxiation. So speaking about preferences in killing tools given gender, this chart might be more instructive.

[Figure: stacked bar chart of victims by cause of death and gender]

Well, for more sophisticated data analysis I’ve yet to learn loads of Statistics. By the way, as to Statistics, I’m still taking Statistics One by Prof. Andrew Conway at Coursera. Although it seemed a bit boring at the beginning, now it’s getting more and more interesting.

Also I have completed the Python course at Codecademy. And immediately started a course in JavaScript. Because I like Codecademy. And because I don’t have enough time right now to focus on learning API with Python there. Never mind that I’m currently doing Introduction to Interactive Programming in Python at Coursera. I promise, I’ll quit it, as soon as it becomes too challenging to be combined with Statistics and Data Analysis, which starts on October 28th.

All this stuff is supposed to be completed by January. I must say, I now feel the strongest urge to get down to something a bit more fundamental, like maths and computer science basics.

Links Links Links

A new bunch of links to resources on statistics etc. that seem helpful to me:

Introduction to Statistics

This is an archive of an introductory statistics course at Coursera, Statistics: Making Sense of Data, by Alison Gibbs and Jeffrey Rosenthal (University of Toronto).

The authors of the course kindly provided a list of recommended literature. I don’t think it would be a crime to reproduce it here. So, they recommended three ‘traditional books’:

  • Introduction to the Practice of Statistics, by David S. Moore and George P. McCabe. (The book is currently in its fifth edition, but any edition will do.)
  • Stats: Data and Models, Canadian edition, by Richard D. De Veaux, Paul F. Velleman, David E. Bock, Augustin M. Vukov, and Augustine C.M. Wong. (The original version of the book, by the first three authors only, is also recommended.)
  • Statistics, by David Freedman, Robert Pisani, and Roger Purves.

And four online resources:

  • OpenIntro Statistics, by David M. Diez, Christopher D. Barr, and Mine Cetinkaya-Rundel. The cool thing about this one is that it’s not just a book, it’s a whole learning tool including labs and some instructions on using R.
  • Online Statistics Education, by David M. Lane, David Scott, Mikki Hebl, Rudy Guerra, Dan Osherson, and Heidi Zimmer
  • HyperStat Online, by David M. Lane
  • StatPrimer, by B. Burt Gerstman

R

Statistics and Python

And last, a couple of books kindly recommended by a great person at P2PU. These connect statistics to programming in Python:

Code sharing options

As I proceed with the Python MOOC, I have had to choose a way to share my code with my peers. There are in fact many options. Here are some of them:

GitHub 

This was recommended in the MOOC instructions. It is a multifunctional platform that lets you create repositories, gists and forks, publish privately or openly, download code and leave comments. What I also like about it is that you can follow users. For sharing homework, gists might be the best option.

There are two shortcomings though:

  • You can’t publish your code without registration
  • Some users complain that the interface is a bit too complicated, so it takes time to get used to it

Pastebin

This is an extremely easy-to-use sharing tool. The first thing you see on the main page is a box where you can paste your code. You don’t have to register to do it (you simply get a link that you can share later). You can also set the expiration time for each paste (from 10 minutes to never), and you can make it public, unlisted or private (for members only). You can also register if you like (I did, to keep my homework in order).

Shortcomings:

  • I haven’t seen any commenting option, which might be good for feedback and revision while learning
  • I also couldn’t find any option to follow other members.

DPaste 

This is a very minimalistic service. You can’t register; you can only paste your code and save it. It then stays there for 30 days, after which it is automatically deleted. So it’s good for quick sharing, but not for continuous and systematic use.

Also I recently found Bitbucket 

But I haven’t explored it yet. If anyone has some experience, please share. There are some explanations as to how to use it though: Bitbucket 101.

As for me, I’m currently using GitHub and Pastebin, because GitHub looks like a wonderful working space and Pastebin is good for sharing with those who are scared of GitHub:

Python MOOC – Week 2

I’m going to sum up the experiences of the past week and share what I managed to find out.

First off, I really like the way the MOOC is organised, especially the way it encourages teamwork and peer-to-peer learning. The first instruction was to sign up for OpenStudy, which is very good for mutual help and revision. But there’s a problem: you can ask only one question at a time. After you ask a question, it appears in the question feed, where everyone can see and answer it. But if you want to ask another one, you have to mark the current one as ‘closed’ first. ‘Closed’ means that it is moved from the feed shown by default to the list of closed questions, so if you haven’t received an answer by then, there’s a chance you never will, because nobody will notice the question.

Ah yes, also OpenStudy is often down, so you sometimes simply can’t use it.

But there are great options outside. First, the MOOC organisers divided all MOOCsters into teams and provided them with mailing list addresses, so some questions can be asked and answered in small groups, with no such limitations.

Finally, there’s one more learning space that I discovered only yesterday and haven’t tried yet, but it looks great: Groups at Codecademy (you have to sign in to see the page). Although I’ve been using Codecademy for quite a while now, I didn’t know they existed. Of course I immediately joined the Python for Beginners group. I hope it’ll be a great experience.

Now a couple of words about this week’s homework. This week was rather challenging for me, because I was struggling to understand how loops work, especially the for loop. One of the tasks was to write code that calculates powers using a for loop. Thanks to my teammates, who helped me figure out what the task was about in the first place: it had to be done without using the built-in exponentiation operator (**).

Now, I had dealt with for loops at Codecademy and found them rather easy. This is what I basically imagined:

for i in range(1, 10, 2):
    print i

So it does what you tell it to with all the items in a range.

But in this case, the code I ended up with after many attempts (and quite a bit of guesswork, I admit) looks like this:

base = input("Enter base: ")
exp = input("Enter exponent: ")
x = base
for n in range(1, exp):
    x *= base
print x

So after I wrote it, I still had a question: how are for n in range(1, exp): and x *= base connected if n (the current item from the range) is never mentioned in the operation? The answer is that it doesn’t have to be mentioned. In this case the for loop only tells the computer how many times the operation must be repeated.
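The same idea can be sketched in a slightly different form. Note the assumptions: this is Python 3 syntax rather than the Python 2 used in the homework, the function name `power` is made up for the example, and the result starts from 1 instead of from the base, so that an exponent of 0 also gives the right answer. It is one possible variant, not the official solution to the task.

```python
def power(base, exp):
    """Raise base to a non-negative integer exp by repeated multiplication."""
    if exp < 0:
        raise ValueError("exp must be a non-negative integer")
    result = 1
    # The loop variable is never used; the range only sets the number of passes.
    for _ in range(exp):
        result *= base
    return result


print(power(5, 4))
print(power(2, 0))
```

Naming the loop variable `_` is a common Python convention for saying "I only care how many times this runs".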

This is what I realised after reading this awesome article about loops in Python. I also realised that there’s a great way to see what a programme does: add print statements that show the process step by step. Like so:

base = input("Enter base: ")
exp = input("Enter exponent: ")
x = base
for n in range(1, exp):
    x *= base
    print x # This shows what's going on in the process
print x

So for instance if we have base 5 and exp 4, the output will be:

25
125
625
625

Also, one of my teammates kindly recommended that I read Learning Python by Mark Lutz (I found out along the way that there’s a whole site about it).

Finally, I played with PyScripter IDE and explored some code sharing options, which I’m going to describe soon.

Oh, by the way, if some peers want to have a look at my whole homework (except for the optional tasks, which I’ll get back to a bit later), it’s here: https://gist.github.com/ansakoy

Python MOOC – Week 1 UPD

I know from experience that ‘next week’ is always full of unpredictable work, sudden meetings and other distracting stuff, so I decided to do my best to play with Python at the weekend. Kudos to Codecademy, once again. The first week’s homework was really easy (but good for revision), while only a month ago I’d have been totally frightened by it.

Just tried out OpenStudy. I was absolutely resolved to use IDLE for the rest of my life (well, of the course), but one peer there endorsed another IDE (Integrated Development Environment) called PyScripter. So I went and checked it out. Not that it makes a huge difference at my level, but I like that PyScripter has a compact layout that works in one window, unlike IDLE, which has a separate shell and text editor.

[Screenshot: PyScripter]

[Screenshot: IDLE (Python Shell)]


I think I’ll try using both and see which is best for me.

Also, we had an illustrative task in natural language processing (exercise 1.11). We were given the sentence ‘Alice saw the boy on the hill with the telescope’ and had to sketch its two possible interpretations. Drawing in MS Paint with a mouse – what a pleasure!

[Image: my sketch for exercise 1.11]
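The two readings can also be written down as nested structures instead of drawings. This is only an informal sketch: the bracketing below is my own assumption about how to group the phrases, not a formal grammar and not part of the exercise, and the names `ALICE_USES_TELESCOPE`, `BOY_HAS_TELESCOPE` and `words` are invented for the example.

```python
SENTENCE = "Alice saw the boy on the hill with the telescope"

# Reading 1: "with the telescope" belongs to "saw" (Alice looks through it).
ALICE_USES_TELESCOPE = (
    "Alice",
    ("saw", ("the boy", ("on", "the hill")), ("with", "the telescope")),
)

# Reading 2: "with the telescope" belongs to "the boy" (he is holding it).
BOY_HAS_TELESCOPE = (
    "Alice",
    ("saw", ("the boy", ("on", "the hill"), ("with", "the telescope"))),
)


def words(tree):
    """Flatten a nested tuple of phrases back into a flat list of words."""
    if isinstance(tree, str):
        return tree.split()
    flat = []
    for part in tree:
        flat.extend(words(part))
    return flat


# Same words in the same order, but grouped differently.
print(words(ALICE_USES_TELESCOPE) == words(BOY_HAS_TELESCOPE))
print(ALICE_USES_TELESCOPE == BOY_HAS_TELESCOPE)
```

Both structures flatten to exactly the same sentence, which is the whole point: the ambiguity lives in the grouping, not in the words.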

Python MOOC – Week 1

[Screenshot: The Mechanical MOOC – A Gentle Introduction to Python]

So, a new (the fourth, as far as I understand) run of the Python Mechanical MOOC officially started a week ago. This week happened to be extremely busy for me, so I had less time for learning than I had hoped. Thanks to the Codecademy lessons I took some time ago, the first batch of tasks didn’t contain much new information for me, but it did contain quite a number of fascinating and revealing details. For one, I found out from this video lecture that some languages allow misleading (‘false’) indentation. Unlike Python, where indentation itself determines which statements belong to a block, many other languages use punctuation such as braces and semicolons to group and separate statements. There, indentation is required only by convention, to make a programme readable and its meaning obvious from its layout. So to make people think that a programme does something different from what it really does, some coders may use false indentation, e.g. in Java or C. In Python that trick doesn’t work, because changing the indentation changes what the programme does.
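A tiny illustration of why indentation can’t lie in Python. The two functions below differ only in how far one line is indented, and they return different results. The function names are made up for this example, and the code is written in Python 3 syntax.

```python
def add_one_inside(limit):
    total = 0
    for n in range(limit):
        total += n
        total += 1  # indented under the loop: runs on every pass
    return total


def add_one_outside(limit):
    total = 0
    for n in range(limit):
        total += n
    total += 1  # dedented: runs once, after the loop has finished
    return total


print(add_one_inside(3))
print(add_one_outside(3))
```

In a brace-based language the second `total += 1` could be indented either way without changing the result, which is exactly what makes false indentation possible there.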

Also, since Python is supposed to be my primary learning focus during this 8-week period, I decided to look at some additional Python courses that might give me a better understanding of what’s going on. One of them is Python Programming 101 at P2PU, and there’s actually a lot of additional information there. For instance, there’s a list of Python-compatible text editors. What I like best about it is the peer reviews of the editors people have tried. So I’ll have to save this for the future:

But for now I’m using IDLE, because I don’t have enough time to try all of them right now. Although I’ve installed Notepad++ just in case.

Also I’m looking forward to getting involved in OpenStudy communication, but I haven’t yet, because I’ve been a bit overloaded (like a + operator) with work.