Welcome to Python-izing the SAS Programmer, a blog for SAS programmers like myself who have spent their careers writing data steps, who are wondering about the rumors they're hearing about the increased use of Python, and who are curious enough to poke their head through the Python door and see what's inside.
Change is always challenging, especially when you've spent so long not changing. But it can be fun and exciting too, and if you're someone who enjoys such challenges, I hope this blog helps you get started. I've spent over 20 years writing SAS code. The data step's SET statement, with its automatic iteration through the rows of an input data set, is second nature to me, to my detriment. Objects, methods, constructors: up until a few years ago, these were foreign concepts. To this day, every line of Python code I write is still written, to some extent, with SAS concepts on my mind, a habit I'm still fighting.
I guess that's why I started this blog. Half of the difficulty associated with change is knowing where to start, but I think the challenges are significantly reduced when we start with something familiar. How do I read data using Python? How do I sort it? Merge it? Transpose it? Summarize it? And how do the answers to these questions compare to the corresponding SAS answers? We'll get to these.
But before we do that, we have to know what a Python "data set" looks like, if there is such a thing. And before we do that, we need to take care of some preliminaries, such as how to get started executing code.
As with most blogs, this one will consist of several "installments". Initially I'll publish once or twice weekly. How many will I do altogether? I have no idea. For now, I have several topics in mind that should get you well on your way. The reader is encouraged to play along with the exercises, and especially to test out their own ideas in their own environment (this is how we learn).
While most of my career has been spent in the pharmaceutical industry, that background isn't a prerequisite for learning from this blog. At worst, you might run into some pharma-flavored examples, but you'll still be able to work through them.
This is not meant to be a cage fight between SAS and Python, or any other kind of competition. It is not my goal to convince you that one language is better than the other, nor am I encouraging you to drop one in favor of the other. Broadening our horizons is what we're ultimately after here, and that can only help us add value to ourselves and the companies we work for, and maybe even make us all-around better people. There will be times when I point out that one language handles something better than the other, but don't take it personally. I love them both the same.
If you came here looking for everything Python, you came to the wrong place. The focus really is on how certain aspects of Python compare to what we know in SAS, so we won't come close to covering the whole language. The reader is encouraged, however, to supplement their journey through this blog with the Python documentation, which can be found at www.python.org.
Hopefully you don't mind the casual tone of these posts. It'll be sprinkled into the lessons, but not so over-the-top that we lose sight of what we're learning. We all love what we do, so why not have some fun in the way we talk about it? You'll notice that you are welcome to comment on posts. Let's be sure to keep the comments relatively professional and civil. We're all on the same side, so let's try to help each other out!
With all of that said, let's get started! I know you're probably anxious to dive into data, but before we do that, we need a couple of installments to get us set up. In our first, we'll go through some object-oriented preliminaries (no exercises for you here). In our second, we'll get Python installed and start programming.
Zareena (Friday, 12 July 2019 17:42)
Hi, where can I find what needs to be installed and which exercises to work on?
Mike (Wednesday, 24 July 2019 15:25)
Zareena, so sorry for such a late response. A great place to start is the Python documentation. It's very extensive and should have some good examples you can try out on your own. That's at docs.python.org. Also, you might want to check out the PIP website - https://pip.pypa.io/en/stable/user_guide/#installing-packages. PIP is an easy-to-use tool for installing third-party packages (such as Pandas). Hope this helps!