As I said in an earlier post, I'll be documenting my research process. Here are the links I looked at and found the most interesting.
One of the posts already provides a dataset, which I'll also use while working through the others.
What is a Time Series?
This question came up during my research. My first answer was: “any data whose data points have a timestamp attached”. But the most descriptive definition I could find was:
“Time Series is a sequence of well-defined data points measured at consistent time intervals over a period of time” (source).
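To make the "consistent time intervals" part concrete, here is a minimal sketch in pandas. The dates and values are made up for illustration, and the names `ts`, `gaps` and `is_regular` are my own; the idea is just to check that every gap between consecutive timestamps is the same.

```python
import pandas as pd

# Five made-up daily observations (illustrative values, not real data).
index = pd.date_range("1995-08-02", periods=5, freq="D")
ts = pd.Series([34, 45, 38, 41, 36], index=index)

# Gap between each timestamp and the previous one (the first has no predecessor).
gaps = ts.index.to_series().diff().dropna()

# If there is only one distinct gap, the series is regularly spaced.
is_regular = gaps.nunique() == 1
print(is_regular)
```

If a day were missing from the index, `gaps` would contain more than one distinct value and the check would fail.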
Data
The data used in this post is the “Daily Female Birth” dataset. It is basically made up of tuples of (date, number of females born).
Example:
02/08/1995, 34
03/08/1995, 45
…
Code
I just started following the tutorials and dove into the code, mostly to see what is possible. Here is the notebook, with explanations where I thought they were needed. If you have any suggestions for improvement, please let me know in the comments; I'd appreciate it.
My setup is Python only (as in the linked articles), and I use Anaconda as the package/environment manager. The first thing I do is install and import the packages I'm going to use and read the data into a Series, as pointed out in the article:
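A minimal sketch of that step, under a few assumptions: the column names `date` and `births` are hypothetical, the dates are read as day/month/year, and an in-memory string with the two sample rows above stands in for the real CSV file so the snippet runs on its own.

```python
import io

import pandas as pd

# Stand-in for the CSV file: the two sample rows from above, plus an
# assumed header line (the real file's column names may differ).
csv_text = """date,births
02/08/1995,34
03/08/1995,45
"""

df = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are day/month/year. Change the format string if the
# real file uses month/day/year.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# Use the dates as the index and keep the single value column as a Series.
series = df.set_index("date")["births"]
print(series.head())
```

With the real file, replace `io.StringIO(csv_text)` with the path to the downloaded CSV.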
After that comes one of the most useful functions on pandas DataFrames, Series, etc.: describe(). As the documentation puts it:
“Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.”
For those who don’t know what NaN values are, they are the simplest representation of missing data. This is a whole different subject in data analysis that I don’t want to touch yet.
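Here is a small sketch of describe() on a Series that contains one NaN, to show the "excluding NaN values" part in action. The numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

# Four made-up values, one of them missing.
values = pd.Series([34, 45, np.nan, 41])

# describe() returns a Series with count, mean, std, min, quartiles and max.
summary = values.describe()
print(summary)

# The missing value is skipped: only three values are counted.
print(summary["count"])
```

Note that the count reflects only the non-missing entries, so comparing it with `len(values)` is a quick way to see how many values are missing.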
Last but not least comes visualizing the time series data; a simple plot() call helps with that. I've now realized that my research was too superficial, since the three articles don't add much to each other. I hope that tomorrow I'll bring new articles into the research and also explore some new ways to analyze time series.
Plotting from January 1st to April 1st:
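A sketch of how that slice and plot could look, assuming the Series has a DatetimeIndex. The data here is randomly generated as a stand-in for the real dataset, and the year 1995 is only borrowed from the sample rows above. matplotlib needs to be installed, since pandas uses it for plotting.

```python
import numpy as np
import pandas as pd

# Stand-in data: one made-up value per day for a full year.
rng = np.random.default_rng(0)
index = pd.date_range("1995-01-01", periods=365, freq="D")
series = pd.Series(rng.integers(30, 60, size=365), index=index)

# Label-based slicing on a DatetimeIndex includes both end dates.
window = series["1995-01-01":"1995-04-01"]

# Series.plot() draws a line chart with the dates on the x-axis.
# In a notebook the figure renders inline; in a script, call
# matplotlib.pyplot.show() afterwards.
ax = window.plot(title="Births, January 1st to April 1st")
ax.set_ylabel("births")
```

Slicing by date strings only works this way when the index holds real datetimes, which is one reason to parse the dates when reading the file.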