Guidance

for users, publishers and sysadmins

Dataset form for adding/updating datasets

NB You can’t use this method for publishing INSPIRE/Location data. See more here.

Add a dataset

  1. Ensure you have a user account on data.gov.uk and it has been assigned permission for your organization, either as editor or admin. See: Becoming an editor

  2. There are two methods:

    A. Click on the “Publisher tools” button and choose “Add a new dataset”.

    B. Go to your publisher page and click “Add a new dataset”. (Note: if you are an editor, some of these options will not be shown to you.)

  3. Complete the form using the guidance below.

  4. Click ‘Save and finish’. There is more about this below.

Edit a dataset

  1. Ensure you have a user account on data.gov.uk and it has been assigned permission for your organization, either as editor or admin. See: Becoming an editor

  2. Find the dataset you wish to edit and click “Edit dataset properties”.

  3. Complete the form using the guidance below.

  4. Click ‘Save and finish’. There is more about this below.

Form fields guidance

The publication/editing wizard follows a tab structure. When creating a new dataset, you have to click ‘next’ at the bottom of the wizard to move on to the following tab.

Naming the dataset

Your dataset’s name is the most important thing to get right. If the name is unclear, users may not be able to find the dataset, or may not understand what it is when it appears in search results, whether they are using Google or data.gov.uk’s own search.

Don’t include:

You can’t edit the URL once the dataset has been published.

URL

When you create a dataset, a URL for your dataset will be created from the name that you enter.

If the generated URL is already used by another dataset, a warning will be displayed saying ‘This URL is not available’. You can either change the name of the dataset to generate a new URL, or manually change the URL until an unused value is found. Including the name of your organisation in the title is an easy way to make it unique.

When you edit a dataset, the URL will not be editable. This is to provide continuity for people when they refer to the dataset. This is particularly important to users of the API.
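As a rough illustration of how a URL might be derived from a dataset name and checked for availability, here is a minimal Python sketch. This page does not state the exact rules data.gov.uk applies, so the slug rule, the function names and the `taken` set below are all assumptions made for illustration.

```python
import re


def name_to_slug(name: str) -> str:
    """Assumed rule: lower-case the name and join its words with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


def is_available(slug: str, taken: set) -> bool:
    """True if no existing dataset already uses this slug."""
    return slug not in taken


# Hypothetical set of URLs already used by other datasets.
taken = {"spend-over-25-000"}

slug = name_to_slug("Spend over £25,000")
if not is_available(slug, taken):
    # Following the tip above: include the organisation name to make it unique.
    slug = name_to_slug("Example Department spend over £25,000")
```

In this sketch the first attempt collides with an existing URL, so the organisation name is added to the title and the slug is regenerated.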

Information icons

Many fields have an ‘information’ icon, which displays more guidance if you hover the mouse over it.

Timeseries vs Single File

You can publish data on data.gov.uk as either a ‘single file’ or ‘time series’ dataset.

Use a single file dataset if the data only needs to be published once.

Publish your dataset as a time series dataset if it needs to be updated, eg every month.

Monthly datasets problem

Data files

Although it says “files”, you should also add links to APIs here (eg SPARQL, WMS, etc). Usually this is the root URL of the API, which might not return much by default, but it is still good to add it here. If you have a web page that helps you run SPARQL queries, then a link to that would go in the Additional Resources section - see below.

Note about gov.uk files

For files on gov.uk (using ‘Whitehall’, appearing here: https://www.gov.uk/government/publications), the data URL you supply should be the download link, not the web page. Where it says “Download as CSV”, right-click and click “Copy link address”, then right-click on the field in the data.gov.uk form and click “Paste”.

NB: if you update the file on gov.uk in future, gov.uk gives it a new download URL. You therefore need to update the URL on data.gov.uk too.

Time series

If you select time series, you can specify the “Update Frequency”, and a “Date” column appears for you to fill in for each file.

The benefit of choosing ‘time series’ is that the files will be displayed to the public in date order, with previous years hidden by default. For more than a few files, this is a much better experience for users. In addition, the date is available to users in a machine-readable format.

‘Check’ button

The ‘check’ functionality allows the system to identify the format of the file and automatically add it to the record, which avoids the same format being spelled in different ways. The ‘check all URLs’ option allows you to check that the URLs entered are all active and working.

If you encounter problems when clicking the ‘check’ or ‘check all URLs’ buttons (no format appears in the format box, or the ‘check all URLs’ process takes too long), don’t use them. Instead, manually enter the format of your file (which SHOULD always be CSV or another open format, NOT XLS, HTML or PDF). These problems arise because some older browsers may not work well with this feature, and because the file type is determined simply from the URL extension or mimetype.
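To show what determining the format “simply from the URL extension or mimetype” could look like, here is a minimal Python sketch. It is not the site’s actual implementation: the function names and the `MIME_TO_FORMAT` table are hypothetical, and only the list of discouraged formats comes from the guidance above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Illustrative mapping only; not the site's real lookup table.
MIME_TO_FORMAT = {
    "text/csv": "CSV",
    "application/json": "JSON",
    "application/pdf": "PDF",
    "text/html": "HTML",
    "application/vnd.ms-excel": "XLS",
}

# Formats the guidance above says to avoid.
DISCOURAGED = {"XLS", "HTML", "PDF"}


def guess_format(url, content_type=None):
    """Guess a format label from the URL extension, then from the mimetype."""
    suffix = PurePosixPath(urlparse(url).path).suffix
    if suffix:
        return suffix.lstrip(".").upper()
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        mime = content_type.split(";")[0].strip().lower()
        return MIME_TO_FORMAT.get(mime)
    return None


def is_discouraged(fmt):
    """True for the non-open formats named in the guidance."""
    return fmt in DISCOURAGED
```

A URL with no extension and no recognised mimetype yields `None` here, which corresponds to the case where no format appears in the format box and you need to type it in yourself.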

Description and themes

Along with the title, the first sentence of the description will be shown in search results. This sentence should be 140 characters or less. Search engines will automatically shorten any descriptions that are longer than this.
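If you want to check the length of the first sentence before pasting a description into the form, a small Python sketch like the following could help. The sentence-splitting rule is a deliberately naive assumption (a sentence ends at the first full stop, exclamation mark or question mark followed by whitespace or the end of the text), and the function names are hypothetical.

```python
import re

MAX_SUMMARY_CHARS = 140  # limit stated in the guidance above


def first_sentence(description: str) -> str:
    """Return the text up to the first sentence-ending punctuation mark."""
    text = description.strip()
    match = re.search(r"(.+?[.!?])(\s|$)", text, flags=re.DOTALL)
    return match.group(1) if match else text


def summary_fits(description: str) -> bool:
    """True if the first sentence is within the 140-character limit."""
    return len(first_sentence(description)) <= MAX_SUMMARY_CHARS
```

For example, `first_sentence("Monthly spend over £25,000. Published by the department.")` returns only the first of the two sentences, and that is the part measured against the limit.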

The description should explain:

Your description must be written in plain English. Include any keywords that you didn’t use in the title to help users find your dataset.

Themes

Each dataset on data.gov.uk is allocated a theme.

To have your dataset allocated to a theme, click the ‘update themes’ button after you’ve added your description. Your dataset will be automatically associated with a ‘primary theme’ and, in some cases, a ‘secondary theme’ too.

The themes are decided by the most frequently used words in the title and description. For example, if your description features the words ‘payment’ or ‘supplier’, your dataset will be allocated the ‘Government spending’ theme.
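The idea of allocating themes from frequently used words can be sketched in Python as below. This is only an illustration of the principle, not the algorithm data.gov.uk runs: the ‘Government spending’ keywords come from the example above, while the ‘Health’ entry, the scoring rule and the function name are assumptions.

```python
import re
from collections import Counter

THEME_KEYWORDS = {
    "Government spending": {"payment", "supplier"},  # from the example above
    "Health": {"hospital", "patient"},  # hypothetical entry for illustration
}


def suggest_themes(title: str, description: str) -> list:
    """Return up to two themes, ranked by how often their keywords occur."""
    words = re.findall(r"[a-z]+", f"{title} {description}".lower())
    counts = Counter(words)
    scores = {
        theme: sum(counts[word] for word in keywords)
        for theme, keywords in THEME_KEYWORDS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # First entry plays the role of the primary theme, second the secondary.
    return [theme for theme, score in ranked if score > 0][:2]
```

The sketch matches whole words only, so ‘payments’ would not count towards ‘payment’. A real implementation would presumably handle plurals and other word forms.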

If you think that the themes are inaccurate, please expand the description so that it describes the topic of the dataset more clearly. If that still doesn’t work, follow the instructions on the form to add tags, or contact us.

Licence

Select the licence that the data is being released under. As the form states, this should be OGL for nearly all of central government and its agencies. Publishers should select OGL or another open licence from the list if at all possible. If that is not possible, select “other” from the drop-down and type the licence details in the box which appears.

Publisher

Select the organization that publishes this data. The only available options on this list are the publishers that you are an Editor or Admin for.

Contact details

When you select the publisher in the drop-down list, the contact details from our records associated with the publisher will be displayed. If these details are not suitable for this particular dataset, you can edit them here and your changes are stored as an exception for this dataset.

Note: this feature is currently not working correctly. The contact details associated with the publisher will not be displayed in the form. However they will be displayed by default when you view the dataset, unless you provide different ones in this form.

Additional resources

Here you can enter links to any other document or web page that provides more information on the dataset.

Please ignore the ‘Scraper name’ column and leave its contents blank.

Temporal coverage

Where it is available, it is important to include the time period that the data covers. This may be a single date or a range of dates. If you are entering a single date, please leave the second box blank.

The dates should be in the format DD/MM/YYYY, eg 21/03/2007. Optionally the date can include the time in the format HH:MM, eg 07:45 31/03/2006.
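If you prepare dates in a script before entering them, the two accepted formats can be checked with Python’s standard `datetime.strptime`. The sketch below is only a convenience for publishers, not part of the form itself, and the function name is hypothetical.

```python
from datetime import datetime

# Time-and-date form first (eg 07:45 31/03/2006), then date only.
ACCEPTED_FORMATS = ("%H:%M %d/%m/%Y", "%d/%m/%Y")


def parse_coverage_date(text: str) -> datetime:
    """Parse DD/MM/YYYY, optionally preceded by HH:MM."""
    cleaned = text.strip()
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    raise ValueError(f"Not in DD/MM/YYYY or HH:MM DD/MM/YYYY format: {text!r}")
```

Dates written in another order, such as 2007-03-21, raise a `ValueError` in this sketch, which makes format mistakes visible before they reach the form.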

Geographic coverage

It is also useful to include geographic coverage for the dataset. Use the checkboxes to select one or more areas which are covered by the data.

Extras

Other miscellaneous pieces of information may be attached to a dataset, such as fields that used to be on the form but have since been deprecated. These are collected together in the ‘Extras’ tab. However, editing this tab is reserved for sysadmins, so contact us if it is important to change this information on an existing record.

Form submission

Form errors

When you click “Save and Finish”, any problems with the fields are listed at the top of the form, and yellow warning triangles appear on the affected tabs. When you click on one of those tabs, a red shaded box next to the relevant field says what the problem is.

Occasionally new mandatory fields are added to the form. This can cause form errors to appear when you edit an existing dataset, even if you’ve not changed anything. In this case, fill in the new field to be able to save it successfully.

If, after clicking ‘Save and finish’, you are still in the editing screen then: YOUR CHANGES HAVE NOT BEEN SAVED!

Caching

The data.gov.uk site is cached for 1 hour for general users, but not if you are logged in. This means that if you add a dataset or make a change, you will see it has changed straight away, but it will take up to an hour before it becomes visible to another person visiting the site (ie someone who does not log in). Please be aware of this when notifying people or announcing releases.
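When planning an announcement, the one-hour cache means the earliest time at which every visitor should see your change is one hour after you saved it. A trivial Python sketch of that arithmetic follows. The function name is hypothetical, and the one-hour figure is taken from the paragraph above.

```python
from datetime import datetime, timedelta

CACHE_LIFETIME = timedelta(hours=1)  # cache duration stated above


def visible_to_everyone_by(saved_at: datetime) -> datetime:
    """Latest time by which visitors who are not logged in see the change."""
    return saved_at + CACHE_LIFETIME


# Example: a dataset saved at 09:30 is safe to announce from 10:30.
announce_from = visible_to_everyone_by(datetime(2015, 1, 1, 9, 30))
```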