Exports - API Quickstart
In this tutorial we’ll cover the basics of creating an earned media export using the API.
Note that you need access to the earned media exports feature in your API package to use this feature.
Before you start
Take a look at the Platform Overview guide to understand the key concepts of the Meltwater platform.
To run through this tutorial, you will need:
- Your Meltwater API token
- A Saved Search in your Meltwater App account
Types of export
There are two types of exports:
- One-time exports - these exports are run once, and the data is not refreshed automatically.
- Recurring exports - these exports are run on a schedule you specify. Each time the export runs, the data for the export is overwritten.
Authentication
You need to provide your API token when you call any Meltwater API endpoint.
You provide the API token as a header called apikey, for example:
curl -X GET \
--url "https://api.meltwater.com/v3/searches" \
--header "Accept: application/json" \
--header "apikey: **********"
For full details take a look at the Authentication page.
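The same header can be attached from a script. The following is a minimal Python sketch using only the standard library; the helper name build_request and the API_BASE constant are illustrative choices rather than part of the Meltwater API, and the base URL is simply the one used in the curl example above.

```python
import urllib.request

# Base URL as it appears in the curl examples in this guide.
API_BASE = "https://api.meltwater.com"


def build_request(path, api_key, method="GET"):
    """Build (but do not send) a request that carries the apikey header."""
    request = urllib.request.Request(API_BASE + path, method=method)
    request.add_header("Accept", "application/json")
    request.add_header("apikey", api_key)
    return request
```

To send the request you could pass the result to urllib.request.urlopen, for example urllib.request.urlopen(build_request("/v3/searches", token)).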
Obtaining a Meltwater search
Exports are created with an existing Meltwater search.
In this tutorial we’ll cover using an existing search, but you can create / edit searches using the API. See the Managing Saved Searches guide for more details.
To create an export you need to provide the id of the required search. You can use the GET /v3/searches endpoint to list your current searches:
curl -X GET \
--url "https://api.meltwater.com/v3/searches" \
--header "Accept: application/json" \
--header "apikey: **********"
Example response:
{
"searches": [
{
"updated": "2023-01-10T14:42:10.000Z",
"name": "Elon Musk",
"id": 2382415
}
]
}
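If you look up searches programmatically, a small helper can pull the id out of this response. This is a sketch that assumes the response shape shown in the example above; find_search_id is a hypothetical helper name.

```python
import json


def find_search_id(response_text, name):
    """Return the id of the first search with the given name, or None."""
    payload = json.loads(response_text)
    for search in payload.get("searches", []):
        if search.get("name") == name:
            return search.get("id")
    return None


# The example response from this guide, as a JSON string.
example = (
    '{"searches": [{"updated": "2023-01-10T14:42:10.000Z", '
    '"name": "Elon Musk", "id": 2382415}]}'
)
print(find_search_id(example, "Elon Musk"))  # prints 2382415
```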
Understanding common export options
The following options are available for both one-time and recurring exports.
Choosing an output template
The API allows you to choose from a selection of templates for your export, which you specify using the template field in your export request. The available templates are documented on the Export & Streaming Output Templates page.
We recommend using the general purpose “API JSON” template for most integrations, as this includes all the data fields most customers need.
To use the “API JSON” template you would use the following as part of your export request:
"template": {
"name": "api.json"
}
If you do not specify a template in your request the API will use the legacy output template as documented here.
Controlling data volumes using sampling
When creating an export, you can provide optional sampling parameters to set a maximum document count and/or a percentage sample.
This feature lets you control the number of documents you export, so you can stay within your export limits and still export representative data for high-volume topics. Sampling is supported for both one-time and recurring exports, and returns a random sample across the matching documents that would be in a full export.
There are two parameters that control sampling of results:
- count - the maximum number of documents you’d like to retrieve. Defaults to 2,000,000. Maximum value is 2 million.
- percentage - the percentage of results you’d like to retrieve. Defaults to 100.
Please note that the sampling process is approximate: the number of results will be within ±10% of your parameters.
Example 1 - return a 1% sample of matching documents:
"sample": {
"percentage": 1.0
}
Example 2 - return up to 50,000 documents:
"sample": {
"count": 50000
}
Example 3 - return a 10% sample of documents, but if this results in more than 1,000 documents, reduce the sample rate to limit the total results to 1,000 documents:
"sample": {
"count": 1000,
"percentage": 10.0
}
By default if you do not specify these parameters your export will contain up to 2 million documents as a 100% sample. If your export request matches more than 2 million documents, then it will be sampled down to 2 million results.
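To see how the two parameters interact, the rules above can be sketched as a small calculation. This is an illustration of the documented behaviour, not the API's implementation: expected_sample_size is a hypothetical helper, the input validation ranges are assumptions, and the real result can differ by the ±10% tolerance mentioned above.

```python
MAX_COUNT = 2_000_000  # documented default and maximum for "count"


def expected_sample_size(matched, count=MAX_COUNT, percentage=100.0):
    """Estimate the number of exported documents, ignoring the ±10% tolerance."""
    # Assumed valid ranges; the guide only documents the defaults and the maximum count.
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be greater than 0 and at most 100")
    if not 0 < count <= MAX_COUNT:
        raise ValueError("count must be between 1 and 2,000,000")
    # The percentage sample is applied first, then capped at count.
    return min(count, round(matched * percentage / 100))
```

With the Example 3 settings (count 1000, percentage 10.0), a search matching 5,000 documents gives an estimate of 500, while a search matching 50,000 documents is capped at 1,000.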
Classification and filtering using Custom Categories
Custom Categories are a feature supported in Explore which allows you to categorize documents based upon a boolean query.
Note that currently the API doesn’t support creating or managing Custom Categories, so they must be managed within Explore.
You can use Custom Categories to both classify and filter export results.
To fetch the list of Custom Categories in your account call the GET /v3/custom_categories endpoint:
curl -X GET \
--url "https://api.meltwater.com/v3/custom_categories" \
--header "Accept: application/json" \
--header "apikey: **********"
Example response:
{
"count": 2,
"custom_categories": [
{
"id": 123,
"name": "Japanese cars",
"type": "include"
},
{
"id": 456,
"name": "German cars",
"type": "exclude"
}
]
}
Use the IDs given in this response to specify categories when creating your one-time or recurring export.
Providing a list of Custom Category IDs in your export request will classify results with the categories. The custom.custom_categories field for each document in the result will tell you which categories were matched.
Additionally, you can specify whether to return only documents with at least one category match by setting the filter parameter to true. This option defaults to false, meaning all results are returned regardless of whether they match a specified category.
As an example, this configuration would classify export results with the categories 123 and 321, and only return documents which match at least one category:
{
"custom_categories": {
"ids": [
123,
321
],
"filter": true
}
}
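If you prefer to refer to categories by name, the fragment above can be built from the GET /v3/custom_categories response. This is a sketch that assumes the response shape shown earlier; category_block and its only_matching argument are hypothetical names.

```python
def category_block(response, names, only_matching=False):
    """Build the "custom_categories" request fragment from category names."""
    by_name = {c["name"]: c["id"] for c in response["custom_categories"]}
    missing = [name for name in names if name not in by_name]
    if missing:
        raise KeyError(f"unknown categories: {missing}")
    return {
        "custom_categories": {
            "ids": [by_name[name] for name in names],
            # Maps to the documented "filter" parameter (defaults to false).
            "filter": only_matching,
        }
    }
```

The returned dictionary can be merged into the body of a one-time or recurring export request.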
Creating a one-time export
One-time exports are created using the POST /v3/exports/one-time endpoint. You need to provide a start_date, end_date, format and a search_id to create an export.
Note that times must be provided in the UTC timezone. This is required for one-time exports.
curl --location 'https://api.meltwater.com/v3/exports/one-time' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'apikey: **********' \
--data '{
"onetime_export": {
"search_ids": [<search_id>],
"start_date": "2024-01-01T00:00:00Z",
"end_date": "2024-02-01T00:00:00Z",
"sample": {
"count": 1000,
"percentage": 10.0
},
"template": {
"name": "api.json"
}
}
}'
Example response:
{
"onetime_export": {
"updated_at": "2024-04-02T11:23:40.000000",
"tags": [],
"status_reason": "Export run has not completed yet",
"status": "PENDING",
"start_date": "2024-01-01T00:00:00.000000Z",
"searches": [
{
"id": <search_id>,
"name": <search_name>
}
],
"sample": {
"count": 1000,
"percentage": 10
},
"inserted_at": "2024-04-02T11:23:40.000000",
"id": <export_id>,
"end_date": "2024-02-01T00:00:00.000000Z",
"data_url": <data_url>,
"template": {
"name": "api.json"
}
}
}
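Because one-time exports expect UTC timestamps, it can help to build the request body in code. The following sketch mirrors the fields used in the example request above; utc_timestamp and one_time_export_body are hypothetical helpers, and only fields that appear in that example are included.

```python
import json
from datetime import datetime, timezone


def utc_timestamp(moment):
    """Format a timezone-aware datetime as a UTC string like 2024-01-01T00:00:00Z."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def one_time_export_body(search_id, start, end, template="api.json", sample=None):
    """Return the JSON body for POST /v3/exports/one-time as a string."""
    export = {
        "search_ids": [search_id],
        "start_date": utc_timestamp(start),
        "end_date": utc_timestamp(end),
        "template": {"name": template},
    }
    if start >= end:
        raise ValueError("start must be before end")
    if sample:
        export["sample"] = sample
    return json.dumps({"onetime_export": export})
```

For example, one_time_export_body(2382415, datetime(2024, 1, 1, tzinfo=timezone.utc), datetime(2024, 2, 1, tzinfo=timezone.utc)) produces a body covering January 2024 for the search listed earlier.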
Fetching one-time export results
You can check the status of a one-time export using the GET /v3/exports/one-time/<export_id> endpoint.
curl -X GET \
--url "https://api.meltwater.com/v3/exports/one-time/<export_id>" \
--header "Accept: application/json" \
--header "apikey: **********"
Example response:
{
"onetime_export": {
"updated_at": "2024-04-02T11:24:32.000000",
"tags": [],
"status_reason": "",
"status": "FINISHED",
"start_date": "2024-01-01T00:00:00.000000Z",
"searches": [
{
"id": <search_id>,
"name": <search_name>
}
],
"sample": {
"count": 1000,
"percentage": 10
},
"inserted_at": "2024-04-02T11:23:40.000000",
"id": <export_id>,
"end_date": "2024-02-01T00:00:00.000000Z",
"data_url": <data_url>,
"template": {
"name": "api.json"
}
}
}
Once the status is FINISHED there will be results in JSON format at the data_url. If the status is still PENDING the data_url will return a 403 status code.
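In practice you would poll the status endpoint until the export finishes. Below is a minimal polling sketch. wait_for_export is a hypothetical helper, fetch_status stands for any function of yours that calls GET /v3/exports/one-time/<export_id> and returns the parsed onetime_export object, and the interval and attempt limits are arbitrary assumptions. Only the PENDING and FINISHED statuses shown in this guide are handled; anything else is raised as an error.

```python
import time


def wait_for_export(fetch_status, interval=30.0, max_attempts=20, sleep=time.sleep):
    """Poll until the export reports FINISHED, then return its data_url."""
    for attempt in range(max_attempts):
        export = fetch_status()
        status = export.get("status")
        if status == "FINISHED":
            return export["data_url"]
        if status != "PENDING":
            # Other statuses are not described in this guide, so stop and report.
            reason = export.get("status_reason")
            raise RuntimeError(f"unexpected status {status!r}: {reason}")
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("export did not finish within the allowed attempts")
```

Passing sleep as an argument keeps the helper easy to exercise without real delays.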
Understanding export results
When you specify a CSV template for your output, the data_url will point to a CSV file for you to access, with the CSV containing columns as specified on the Export & Streaming Output Templates page.
For JSON templates, the structure of the result is as follows:
{
"request": {
"company_id": <the id of the account that owns the inputs used>,
"count": <number of results>,
"export_id": <the id of the export in the Meltwater system>,
"inputs": [<the inputs used for the export>],
"period": {
"start": <start of the export period requested>,
"end": <end of the export period requested>
},
"status": <status of the export>
},
"docs": [
<an object for each document in the export, according to the chosen template>
]
}
Note that prior to the templates feature being introduced, exports used a legacy output format as described here.
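Once you have downloaded a JSON result, a quick sanity check is to compare the reported count with the number of documents received. This sketch assumes the structure shown above and that request.count is meant to equal the length of docs; summarise_result is a hypothetical helper.

```python
def summarise_result(result):
    """Return a short summary of a parsed JSON export result."""
    request = result["request"]
    docs = result["docs"]
    return {
        "export_id": request["export_id"],
        "period": (request["period"]["start"], request["period"]["end"]),
        "reported": request["count"],
        "received": len(docs),
        # Assumption: "count" is the number of objects in "docs".
        "complete": request["count"] == len(docs),
    }
```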
Creating a recurring export
Recurring exports run on a schedule you specify. Each time the schedule runs it overwrites the data provided at the data_url.
Specifying a time window and schedule
When you create a recurring export you have a number of parameters you can use to control the schedule and the period of data each run should include.
The window_time_unit parameter sets the frequency of the schedule. You can choose:
- DAY - exports are run daily
- WEEK - exports are run weekly
- MONTH - exports are run monthly
The window_size parameter allows you to specify the period of data to include in each export. Think of this value as a multiple of the window_time_unit you chose. For example, specifying window_time_unit as DAY and window_size as 2 will set a recurring export to run every day, and each run will include the last 2 days of data.
You can use the following parameters to specify precisely the window of data included for each run:
- For daily exports you can specify the daily start time with window_time
- For weekly exports you can specify the day of the week with window_weekday and start time with window_time
- For monthly exports you can specify the day of the month with window_monthday and start time with window_time
Note that you can use the timezone parameter to specify the timezone for your window_time. The timezone must be a valid zone as detailed in the IANA database.
As a full example, the following parameters create a recurring export that will run every day at 09:00 UTC including the last 7 days of data.
{
"window_time_unit": "DAY",
"window_size": 7,
"window_time": "09:00:00",
"timezone": "Etc/UTC"
}
Default values for recurring export attributes are as follows:
- window_time: "00:00:00"
- window_weekday: 1 (Monday)
- window_monthday: 1
- timezone: Etc/UTC
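To make the defaults explicit, the schedule parameters can be assembled in code. The sketch below pairs each window_time_unit with the parameters this guide lists for it and fills in the documented defaults; effective_schedule, DEFAULTS and RELEVANT are hypothetical names, and rejecting parameters not listed for a unit is an assumption rather than documented API behaviour.

```python
# Documented defaults for recurring export attributes.
DEFAULTS = {
    "window_time": "00:00:00",
    "window_weekday": 1,  # Monday
    "window_monthday": 1,
    "timezone": "Etc/UTC",
}

# Parameters this guide lists for each schedule frequency.
RELEVANT = {
    "DAY": ("window_time", "timezone"),
    "WEEK": ("window_weekday", "window_time", "timezone"),
    "MONTH": ("window_monthday", "window_time", "timezone"),
}


def effective_schedule(window_time_unit, window_size, **overrides):
    """Return the schedule parameters with documented defaults filled in."""
    if window_time_unit not in RELEVANT:
        raise ValueError("window_time_unit must be DAY, WEEK or MONTH")
    unknown = set(overrides) - set(RELEVANT[window_time_unit])
    if unknown:
        raise ValueError(f"not applicable to {window_time_unit}: {sorted(unknown)}")
    params = {"window_time_unit": window_time_unit, "window_size": window_size}
    for key in RELEVANT[window_time_unit]:
        params[key] = overrides.get(key, DEFAULTS[key])
    return params
```

For example, effective_schedule("DAY", 7, window_time="09:00:00") yields the same four parameters as the full example above, with the timezone defaulting to Etc/UTC.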
Creating a recurring export
Recurring exports are created using the POST /v3/exports/recurring endpoint.
curl --location 'https://api.meltwater.com/v3/exports/recurring' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'apikey: **********' \
--data '{
"recurring_export": {
"search_ids": [<search_id>],
"window_time_unit": "DAY",
"window_time": "09:00:00",
"window_size": 7,
"timezone": "Etc/UTC",
"sample": {
"count": 1000,
"percentage": 10.0
},
"template": {
"name": "api.json"
}
}
}'
Example response:
{
"recurring_export": {
"updated_at": "2024-04-02T11:34:14.000000",
"tags": [],
"timezone": "Etc/UTC",
"status_reason": "Export run has not completed yet",
"status": "PENDING",
"next_run_date": "2024-04-03T09:30:00Z",
"searches": [
{
"id": <search_id>,
"name": <search_name>
}
],
"sample": {
"count": 1000,
"percentage": 10
},
"inserted_at": "2024-04-02T11:34:14.000000",
"id": <export_id>,
"data_url": <data_url>,
"window_time_unit": "DAY",
"window_size": 7,
"window_time": "09:00:00",
"template": {
"name": "api.json"
}
}
}
Fetching recurring export results
You can check the status of a recurring export using the GET /v3/exports/recurring/<export_id> endpoint.
curl -X GET \
--url "https://api.meltwater.com/v3/exports/recurring/<export_id>" \
--header "Accept: application/json" \
--header "apikey: **********"
Example response:
{
"recurring_export": {
"updated_at": "2024-04-02T11:34:14.000000",
"tags": [],
"timezone": "Etc/UTC",
"status_reason": "Export run has not completed yet",
"status": "PENDING",
"next_run_date": "2024-04-03T09:30:00Z",
"searches": [
{
"id": <search_id>,
"name": <search_name>
}
],
"sample": {
"count": 1000,
"percentage": 10
},
"inserted_at": "2024-04-02T11:34:14.000000",
"id": <export_id>,
"data_url": <data_url>,
"window_time_unit": "DAY",
"window_size": 7,
"window_time": "09:00:00",
"template": {
"name": "api.json"
}
}
}
Once the status is ACTIVE there will be results in JSON format at the data_url. If the status is still PENDING, the data_url will still be available, but will contain only an empty list of documents.
The first time a recurring export runs, the data is made available at the data_url. On subsequent runs the new data overwrites the previous results at the same data_url.