The cardinal rules of using spreadsheet programs for data:
-
Put all your variables in columns - Variables are the things you’re measuring, like ‘weight’ or ‘temperature’.
-
Put each observation in its own row - For example, if you are collecting information on books in a library, every row will identify one specific book.
-
Don’t combine multiple pieces of information in one cell - Break down your information into basic types, so that it can be used and sorted in many ways. This may help you in future and is import\ant to enable reuse.
-
Leave the raw data raw - don’t change it! Always work on a copy of your data, no matter if you have received a spreadsheet from someone else, or if you are making modifications to your own data. This means you can always go back to check the original later.
-
Export the cleaned data to a text-based format like CSV (comma-separated values) - This ensures that anyone can use the data and is the format required by most data repositories.
Other important tips for populating your table:
-
Avoid using multiple tables within one spreadsheet.
-
Avoid spreading data across multiple tabs (but do use a new tab to record data cleaning or manipulations).
-
Record zeros as zeros. Do not leave an empty cell to identify 0 values.
-
Use an appropriate null value to record missing data. 0 is different than no data and your table should represent it.
-
Don’t use formatting to convey information or to make your spreadsheet look pretty. Anything that is done on the spreadsheet that will combine cell and use colour to convey information will not be read by other software.
-
Place comments in a separate column. Only data should go into your table. All the additional information must go on an associated metadata file.
-
Record units in column headers. In 6 months' time you won't remember which unit you used.
-
Avoid spaces, numbers and special characters in column headers.
-
Avoid special characters in your data. Not all software is able to encode them.


