Frequently Asked Questions


General

How can I get access to the earned media analytics features of the API?

Please speak to your account manager to check if your package has access to analytics features.


How can I get access to the owned social analytics features of the API?

Please speak to your account manager to check if your package has access to these features.


Do you have an SDK or Client Libraries?

Currently, we do not provide our own SDKs or client libraries, though we do publish an OpenAPI specification that allows you to generate your own.


How do filters for my Explore searches work with the API?

If you set any filters in your Saved Search, such as limiting to a country or language, then we preserve those filters in your API requests.

Any additional parameters you supply, such as a country or language for an analytics request, are applied on top of those filters. This means that if your Saved Search has a country filter of “US” and your API request specifies “CA” as a country, then you will receive 0 results. This is because we additionally filter your existing Saved Search to also limit the country to “CA”, and no document can have a country of both “US” and “CA”, so no results are returned.
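
In other words, the two filters combine with AND semantics; a small Python illustration using the values from this example:

```python
# The Saved Search filters on "US"; the API request adds "CA".
saved_search_countries = {"US"}
request_countries = {"CA"}

# Both filters apply, so the effective filter is their intersection.
effective = saved_search_countries & request_countries
print(effective)  # set() -> empty, hence 0 results
```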


What is the encoding of timestamps?

All dates are encoded using ISO 8601 format.
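
For example, in Python you can parse and produce ISO 8601 timestamps with the standard library (the values shown are illustrative):

```python
from datetime import datetime, timezone

# Parsing an ISO 8601 timestamp from a response field.
published = datetime.fromisoformat("2018-12-25T00:00:00+00:00")

# Producing an ISO 8601 timestamp, e.g. for a request parameter.
now = datetime.now(timezone.utc).isoformat(timespec="seconds")
```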


What is the encoding of strings?

All strings are encoded with UTF-16.


What are my API rate limits?

Please take a look at the Usage Limits page.


Earned media exports

How much data can I export using the API?

Please take a look at the Usage Limits page.


How many exports can I run at one time?

You can create as many exports as you want. However, only one export will be processed at a time. If you have scheduled multiple exports at the same time, all outstanding exports will remain in a PENDING state until the first export completes. Once the running export is complete, the next PENDING export will be processed.


How many searches can I use for an earned media export?

For an export request you can specify up to 5 searches or tags.


How long does it take to generate my earned media export?

Generating an export takes anywhere from a few minutes to 30 minutes. The more data involved and the bigger the time period, the longer it will take for an export to complete.

Please take this into consideration when configuring scheduled refresh in your BI tool or any other service consuming the exported data.


When will the data for my recurring earned media exports be refreshed?

Your export will be refreshed 30 minutes after the end of the time period that you have selected for your export. For example, if you run a daily export starting at 00:00, then your export will begin to refresh at 00:30.

This 30 minute buffer makes sure that our system has been able to ingest all data for the time period that you have requested.


Does the export window include the “Start Date” and “End Date”?

The export time window begins at, and includes, the Start Date, and continues up to, but does not include, the End Date.

For example, to include all the documents published on December 25th, 2018, you would use a Start Date of 2018-12-25T00:00:00 and an End Date of 2018-12-26T00:00:00.
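
A small Python helper for building such a start-inclusive, end-exclusive window for a single day:

```python
from datetime import date, datetime, timedelta

def day_window(day: date) -> tuple:
    # Start is inclusive, End is exclusive, matching the export
    # window semantics described above.
    start = datetime(day.year, day.month, day.day)
    end = start + timedelta(days=1)
    return start.isoformat(), end.isoformat()

print(day_window(date(2018, 12, 25)))
# ('2018-12-25T00:00:00', '2018-12-26T00:00:00')
```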


What are the different status values for an export?

  • PENDING - Export is waiting to be processed, or is currently being processed.
  • ACTIVE - (recurring exports only) Export has data available and will be refreshed periodically.
  • FINISHED - (one-time export only) Export is complete and data is available.
  • EXPIRED - (one-time export only) Export is older than 60 days, data is no longer available.
  • INCOMPLETE - (one-time export only) Export is finished, but the data is incomplete; you’ll need to re-create the export to get the full set of results.
  • CANCELLED - Export was cancelled by a user or administrator. This can happen if the search associated with the export was deleted, or if access to the API has been removed. Data will no longer be accessible for this export.
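
A minimal polling sketch in Python, assuming a hypothetical get_export callable that fetches an export's current representation and a status field carrying the values above:

```python
import time

# PENDING means queued or currently running; everything else is settled.
TERMINAL_STATUSES = {"ACTIVE", "FINISHED", "EXPIRED", "INCOMPLETE", "CANCELLED"}

def wait_for_export(get_export, export_id, poll_seconds=60):
    # Poll until the export leaves PENDING. Remember that exports are
    # processed one at a time, so a queued export may stay PENDING for
    # a while before it even starts.
    while True:
        export = get_export(export_id)
        if export["status"] in TERMINAL_STATUSES:
            return export
        time.sleep(poll_seconds)
```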


Do you delete exported data? And when?

For one-time exports, we automatically delete exported data 60 days after the export is executed.

For recurring exports, your exports will be updated on your defined time schedule, always providing the newest exported data. If you do not access the data for a recurring export for over 60 days, your recurring export will be cancelled.


Why aren't all the keywords I expect in the matched keywords list?

The keywords for document_matched_keywords are collected from the highlights for the matched document. A highlight is a fragment of the document with a certain length. If the document matches a lot of keywords, they might exceed the fragment length and thus won’t be included in the list.


Earned media streaming

How can I verify that the incoming document is coming from Meltwater?

Please see the Document Verification guide.


From which IPs will Meltwater push documents to my data streams? (IP Whitelisting)

In addition to the document verification that we offer, you can also whitelist the following IPs as an extra level of security, confirming that the messages that are being pushed to your destination (target_url) are coming from Meltwater.

To do so, please whitelist all of the following IPs:

  • 34.247.101.20/32
  • 34.247.65.50/32
  • 34.248.44.159/32
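
For example, a minimal Python check of an incoming request's source address against this allowlist (how you obtain the remote address depends on your web framework and any proxies or load balancers in front of it):

```python
from ipaddress import ip_address, ip_network

# The Meltwater source ranges listed above.
MELTWATER_NETWORKS = [
    ip_network("34.247.101.20/32"),
    ip_network("34.247.65.50/32"),
    ip_network("34.248.44.159/32"),
]

def is_from_meltwater(remote_addr: str) -> bool:
    # True if the request's source address is on the allowlist.
    addr = ip_address(remote_addr)
    return any(addr in network for network in MELTWATER_NETWORKS)
```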


While creating a data stream I am getting an error that my target_url is invalid. What is the expected format?

You cannot create the data stream if your target_url:

  • is a malformed URL (missing scheme, missing host, etc.)
  • has a host that is a private IP address
  • has a host that is a loopback address or localhost
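
As a convenience, you can pre-check a candidate target_url against these rules on the client side before calling the API; a Python sketch (the API's own validation is authoritative and may be stricter):

```python
from ipaddress import ip_address
from urllib.parse import urlparse

def looks_like_valid_target_url(url: str) -> bool:
    parts = urlparse(url)
    if not parts.scheme or not parts.hostname:
        return False  # malformed: missing scheme or host
    if parts.hostname == "localhost":
        return False  # loopback by name
    try:
        addr = ip_address(parts.hostname)
    except ValueError:
        return True   # a regular hostname, not an IP literal
    return not (addr.is_private or addr.is_loopback)
```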


I created a data stream but it looks like it disappeared. What happened?

There are two situations in which this can happen:

  • If you create a data stream with a target_url that is not valid, we will delete the data stream.

  • Similarly, if your target_url becomes unavailable and unable to receive content, we will eventually delete the data stream. We will retry pushing data to your target_url 5 times, with a 5 minute delay between each retry. If your target_url does not respond with a 2xx within a 30 second timeout on any of those requests, we will delete the data stream. Effectively this means that if the service behind your target_url is unavailable for a total of 25 minutes, we will delete the data stream.

We have implemented this logic to prevent backpressure on our system from queueing up documents for high-volume searches.

What this means for you as the API consumer is that you need to choose one of these implementation paths:

Solution A: Build your system such that you can do zero-downtime releases of the service backing your target_url. That way your target_url is always up, and your data streams will never get deleted.

Solution B: Regularly check that the data stream for the search that you want to stream data for is still active. If not, re-create the data stream, as sketched below.
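
A sketch of Solution B in Python, assuming hypothetical list_streams and create_stream helpers that wrap your client's calls to the data stream endpoints (the search_id field name is likewise assumed):

```python
def ensure_stream(list_streams, create_stream, search_id, target_url):
    # list_streams and create_stream stand in for your client's calls
    # to the data stream endpoints; run this on a schedule, e.g. every
    # few minutes.
    existing = [s for s in list_streams() if s.get("search_id") == search_id]
    if not existing:
        # The stream was deleted (e.g. the target_url was unreachable
        # for too long), so re-create it to resume delivery.
        create_stream(search_id=search_id, target_url=target_url)
```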


Why do I receive older documents that are not from today?

Through the Streaming API you may receive documents with a document_publish_date that is “older” than the current day. You should expect to receive documents that are up to 31 days old.

There are various reasons why documents can be republished through our Streaming API:

  • The ingestion of data into our system may be delayed. We cannot guarantee that we fetch news articles as soon as they have been published by their original source, so we might stream out data to you today that was actually published some time in the past (e.g. the day before).
  • Articles can get updated. You might receive a document with a given document_publish_date and then receive an updated document from the same article with an older document_publish_date. This can happen if a publisher decides to modify and republish a news article.
  • Major changes to our list of news and social sources can also re-trigger document publishing: modified sources are reprocessed and articles are refetched. Potentially, we even pick up articles we have not collected before. This way we ensure that we always provide the richest data set of articles for customers.

How to handle such older articles?

How you handle such older articles in your integration with our API (your API client) depends on your business case.

A possible approach would be to simply ignore a document, identified by its document_id, if you have already received it before. The downside of this approach is that you might miss updates to the document caused by, for example, new features we have built.

Our general recommendation is the following:

  • Expect every document to contain valuable updates; thus, store (or reprocess) the document in your client.
  • If you are storing the document, we recommend overwriting any existing document with the same document_id. The worst case would be that you simply write the same document over and over again. Potentially, though, the more recently received document will contain updated and relevant information.
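
A minimal sketch of that overwrite approach, assuming a local SQLite store (the document_id and document_publish_date field names mirror those used in this FAQ; your schema will differ):

```python
import json
import sqlite3

conn = sqlite3.connect("documents.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS documents ("
    " document_id TEXT PRIMARY KEY,"
    " document_publish_date TEXT,"
    " payload TEXT)"
)

def upsert_document(doc: dict) -> None:
    # Insert the document, or overwrite the row with the same
    # document_id so the most recently received version wins.
    conn.execute(
        "INSERT INTO documents (document_id, document_publish_date, payload)"
        " VALUES (:id, :published, :payload)"
        " ON CONFLICT(document_id) DO UPDATE SET"
        " document_publish_date = excluded.document_publish_date,"
        " payload = excluded.payload",
        {
            "id": doc["document_id"],
            "published": doc.get("document_publish_date"),
            "payload": json.dumps(doc),
        },
    )
    conn.commit()
```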


What happens if I go over my monthly document limit?

As per your contract, you are entitled to retrieve a set number of documents per month through the API.

If you go over this limit:

  • Don’t worry, you will continue to receive documents - the service will not be interrupted.
  • Your Meltwater Sales representative will contact you to discuss possible upgrade options, or ways to limit the amount of data that you are requesting.


What is the maximum amount of data streams I can create?

Please take a look at the Usage Limits page.


Will I be notified if a data stream is deleted due to an unavailable target_url?

Yes, if a data stream is deleted because the target_url becomes unavailable, we will notify you through two channels:

  • Email Notification: We will send an email to the registered account email address, detailing the issue and the specific data stream that has been affected.
  • In-App Notification System: Additionally, a notification will be sent to your account in our application.