I mentor new (💫) and aspiring data scientists as they enter (🚪) and level up (📈) in the field. I help data pros find work they love (❤️) and that loves them back.
Image Credit: Author’s Illustration “Resource Bank.”

Table Of Contents

1) Building A Professional Portfolio
  • Getting A Portfolio Started
  • Adding & Enhancing A Portfolio
  • Contribute To Other Projects

2) Data Science & Machine Learning
  • Making Fictional Data (For testing, training, demonstration)
  • K-Nearest Neighbors
  • Distance Measures
  • Automating Data Collection

3) Career Advice
  • Big Career Mistakes (Big Ones)
  • Career Paths

4) Data Culture
  • Data Driven Culture?
  • What Is A Data Set?
  • Columns, Variables, Dimensions . . .

5) Professional Writing Resources
  • Common Writing Mistakes
  • Research Questions

6) Personal Meets Professional
  • Coming Out

Building A Professional Portfolio

Building a professional portfolio takes time and dedication. These articles provide advice on how to get started. Once started…

Last Updated: March 27, 2021.

Attribution Strategy

My goal is to provide readers with a pleasant opportunity to learn about data science, data-related careers, and other professional or personal topics. To make that experience visually and aesthetically pleasing, I sometimes use a variety of images. I aim to provide attribution and credit. The notes below give additional information about the images I use online.

Attribution Keywords

“Via Design Pickle” — The Design Pickle service provides visual design services on a monthly subscription basis. The artists at Design Pickle provide subscribers with images, illustrations, graphics, and other related productions. Design Pickle subscribers own the productions…


This article uses fictional data, previously generated using the code in an earlier article, to illustrate the k-nearest neighbors classification algorithm. Readers can use this article as a cookbook for executing classification with the k-nearest neighbors algorithm.

Both a Jupyter notebook and a YouTube instructional video accompany this article. I placed links to these additional resources at the end of the article.

The Algorithm

This algorithm, k-nearest neighbors, is one of the simplest supervised machine learning algorithms available for classification. Frequently, the k-nearest neighbors algorithm performs as well as many more sophisticated machine learning algorithms.
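To make the idea concrete, here is a minimal sketch of k-nearest neighbors classification using only the Python standard library. It is not the code from the accompanying notebook. The function name `knn_predict`, the tiny data set, and the choice of Euclidean distance with a simple majority vote are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote among the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up data: two features per observation, two classes (hypothetical values).
points = [(1.0, 1.2), (0.8, 0.9), (1.1, 1.0), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, query=(1.0, 1.0), k=3)
print(prediction)  # prints "low": all three nearest neighbors carry that label
```

In practice a library implementation would likely be preferable; the point of the sketch is only that "find the k closest labeled observations and let them vote" is the whole algorithm.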

The underlying principles at…

Image Credit: Author’s Illustration. Data Science Career Paths.


There are many ways to pursue a career in data science. This article covers six possibilities. Across all of them, keep two things in mind. One, there are more paths than the ones covered here. Two, there is no right or wrong way to go about it.

From my frequent and ongoing discussions with other data professionals, I wanted to share a summary of common paths towards data science.

For example, I spent most of my career working in education. In my career, I have worked in the classroom. I taught abroad. Also, I moved into education administration before I finally transitioned…

Image Credit: Author’s original illustration.

Introduction & Method

After a recent article on the topic of sourcing federal data, in which I show how to use Python to automate the process of getting data from the US Department of Education (US DOE) and then assembling that data into a panel data set, I started getting questions.

Why would you go through the trouble of writing code for this? Wouldn’t it be faster to just download the files and then use point and click to assemble the data?

The answer is, it depends. I did an experiment. In executing this experiment I recorded myself as I used point and…

Have you considered a presentation on this topic (spider and radar plots) for the Stata 2021 conference?

Image Credit: Author’s original illustration.

Since writing about this topic earlier, a handful of folks throughout the community have shared with me their own picks for tools that generate fictional data. I evaluated three tools to see how well they can produce the fictional data I previously wrote about. Here are the results of that evaluation.

1) Faker — Gets very close.

2) — Gets close.

3) Mockaroo — Gets very close.

4) On Your Own — Perfect Match.

Here they are, in no particular order. Below I write a bit about each. I evaluate each on a three-point scale to…

About once a week I get (or see online) a question from a fellow data scientist or aspiring data scientist: “Where can I get a data set to play with?” Or, “I’m looking for an interesting data set to learn with, any suggestions?”

There are plenty of interesting data sets out there. But why not make your own data? Making your own fictional data is also a useful skill when you need data for testing or demonstration purposes. This article will show you how to generate fictional data (this is one of many methods).
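As a rough illustration of what "making your own data" can look like, here is a small sketch that uses only Python's standard library. It is not the method from the article itself. The function name `make_fictional_data`, the column names, the two groups, and the score distributions are all hypothetical choices.

```python
import random


def make_fictional_data(n_rows, seed=42):
    """Return a list of fictional records; a fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_rows):
        group = rng.choice(["A", "B"])
        # Hypothetical rule: group B averages a higher score than group A.
        mean = 70 if group == "A" else 80
        score = round(rng.gauss(mean, 5), 1)
        rows.append({"id": i + 1, "group": group, "score": score})
    return rows


data = make_fictional_data(5)
for row in data:
    print(row)
```

Seeding the generator means the same fictional data set comes back every time, which is handy for testing and for demonstrations that others need to reproduce.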

At the bottom of…

Photo by Pang Yuhao on Unsplash


Imagine a scenario in which a college or university receives criticism in the form of negative print, online, and social media attention. Suppose that attention focuses on the institution’s undergraduate application fee rates.

How can the institution respond? If your first thought was to compare the institution to other, similar institutions, this article is for you. This hypothetical about negative media attention provides a case study for the applied use of distance measures. Institutions can use these distance measures to identify meaningful comparison groups. …
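A short sketch may help show how distance measures can pick out a comparison group. Everything here is an assumption for illustration: the institutions are fictional, the two characteristics (enrollment and admission rate) are made up and already rescaled to a 0 to 1 range, and the helper `nearest_peers` is hypothetical. Euclidean and Manhattan distance are used as two common examples, not necessarily the measures the article covers.

```python
import math

# Fictional institutions with made-up characteristics, rescaled to 0-1:
# (enrollment, admission rate)
institutions = {
    "Target U": (0.50, 0.60),
    "College A": (0.55, 0.58),
    "College B": (0.10, 0.95),
    "College C": (0.45, 0.70),
}


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences along each dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_peers(target, data, distance, n=2):
    """Return the n institutions closest to `target` under `distance`."""
    others = [name for name in data if name != target]
    return sorted(others, key=lambda name: distance(data[target], data[name]))[:n]


peers = nearest_peers("Target U", institutions, euclidean)
print(peers)  # prints ['College A', 'College C']
```

With a comparison group in hand, the institution could then compare its application fee against the fees at those peers rather than against every institution at once.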

Photo Credit: Unsplash. (Original). Using surveys to get feedback from students online.


Getting feedback from students has always been a priority of mine. I collect feedback early and often. My practice has followed a specific format that asks open questions.

  • What is something that is going well so far? What, if anything, would you like to see more of?
  • What, if anything, isn’t going well so far? What would you like to see less of?
  • What questions about the course requirements, assignments, and expectations do you have?
  • What questions about the course material do you have?

I’m also finding, and I think this is perhaps due to a lack of in-person interactions…

Adam Ross Nelson

