Data Wrangling


If you keep your data in a database, you may already have done some data wrangling. Data wrangling is the process of collecting, selecting, and transforming your data so that it can answer your research question. It is also known as data cleaning, and it usually takes far more time than you might expect.

In general, it is a good idea to do your data wrangling in either Python or R. As long as you keep the script, it retains every step you performed, so it is easy to review each stage and to demonstrate your steps if reviewers or publishers ask for them. A script also lets you carry out similar data cleaning tasks by quickly modifying the original code, which speeds up your data wrangling process.
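To make the idea of a retained, reusable script concrete, here is a minimal Python sketch that uses only the standard library. The data set, the column names, and the function names (collect, select, transform, wrangle) are all hypothetical and chosen purely for illustration; in practice many people use a data-frame library instead, but the principle is the same.

```python
import csv
import io

# Hypothetical raw data, inlined so the sketch is self-contained.
# In a real project this would come from a file or a database export.
RAW_CSV = """participant,age,group,score
p01,34,control,12.5
p02,,treatment,15.0
p03,29,treatment,not recorded
p04,41,control,9.5
"""


def collect(text):
    """Step 1 (collect): read the raw rows into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def select(rows, group):
    """Step 2 (select): keep only the rows that belong to one group."""
    return [row for row in rows if row["group"] == group]


def transform(rows):
    """Step 3 (transform): convert scores to numbers, dropping rows
    whose score is not numeric."""
    cleaned = []
    for row in rows:
        try:
            score = float(row["score"])
        except ValueError:
            continue  # skip entries such as "not recorded"
        cleaned.append({"participant": row["participant"], "score": score})
    return cleaned


def wrangle(text, group):
    """Run all three steps in order; the script itself documents them."""
    return transform(select(collect(text), group))


# Reusing the same steps for a similar task only means changing one argument.
control = wrangle(RAW_CSV, "control")
treatment = wrangle(RAW_CSV, "treatment")
print(control)
print(treatment)
```

Because each step is a named function, a reviewer can read the script from top to bottom to see exactly what was done to the data, and a similar cleaning task needs only a small change to the original code.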