Configuring a Data Source for OData Access

  • DataDirect Cloud now provides support for OData.  OData through DataDirect Cloud provides a REST web services method for accessing data sources both in the cloud and behind your firewall.

    A DataDirect Cloud data source must be enabled for OData access.  Enabling a data source involves specifying which schema in the data source is to be exposed via OData and, optionally, which tables are to be exposed or excluded.  This is done by specifying a value for the ODataSchemaMapping data source option.  The details for specifying this option can be found at

    http://documentation.progress.com/output/DataDirect/DataDirectCloud/index.html#page/OData/DataDirectCloud.04.html#

     

    Refresh the OData Data Model

    The first time a data source is accessed using an OData query, the DataDirect Cloud service will generate the OData data model based on the ODataSchemaMapping specified.  Once the data model is created, it is persisted and will be used for subsequent OData queries against that data source.  If changes are made to the ODataSchemaMapping, or if new tables are added to the back-end data source, the OData data model must be refreshed in order to pick up those changes.

    In the current release of DataDirect Cloud, the only way to refresh the OData data model is to issue REST calls against the DataDirect Cloud Management API.  A soon-to-be-released update to DataDirect Cloud will provide a GUI editor for generating the data model mapping and a button for refreshing the model from the Web UI.  In the meantime, you need to use the Management API to:

    • Get the id of the data source that you want to refresh
    • Send a data model refresh request using that data source id

    This can be done either programmatically or via an HTTP tool such as curl or Postman.

    To get the id of the data source, issue an HTTP GET request to the following URL:

    https://service.datadirectcloud.com/api/mgmt/datasources


    This request uses HTTP Basic Authentication.  Use your DataDirect Cloud user id and password as the Basic Authentication credentials.
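
    For example, done programmatically with Python and the requests library, the request might look like the sketch below (myuser and mypassword are placeholders for your own DataDirect Cloud credentials):

        # Minimal sketch: list the data sources defined for your account,
        # authenticating with HTTP Basic Authentication.
        import requests

        resp = requests.get(
            "https://service.datadirectcloud.com/api/mgmt/datasources",
            auth=("myuser", "mypassword"),  # placeholder credentials
        )
        resp.raise_for_status()
        print(resp.json())  # the list of data source entries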

    The response to this request will be the list of data sources defined for the user id used for authentication.  Find the data source for which you want to refresh the OData data model.  The data source entry will look something like this:

    {
        "id": "1234",
        "name": "SQLServerOPTest",
        "dataStore": 46,
        "description": ""
    },

    In this entry, the data source name is SQLServerOPTest and the id for the data source is 1234.  Now that we have the data source id, the model refresh request can be sent.  The URL for sending the refresh request has the form:

    https://service.datadirectcloud.com/api/mgmt/datasources/<datasource-id>/map

    So to refresh the model for the SQLServerOPTest data source above, submit an HTTP POST request to:

    https://service.datadirectcloud.com/api/mgmt/datasources/1234/map

    The body of the POST must be set to the following JSON string:

    {
        "model":"refresh"
    }

    This will refresh the data model.  The next OData query submitted to the data source will cause the data model to be generated again.
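
    For example, the same refresh could be sent programmatically with Python and the requests library (the credentials are again placeholders; passing the body through the json parameter also sets the Content-Type header to application/json):

        # Minimal sketch: send the data model refresh request for the
        # data source with id 1234.
        import requests

        resp = requests.post(
            "https://service.datadirectcloud.com/api/mgmt/datasources/1234/map",
            auth=("myuser", "mypassword"),  # placeholder credentials
            json={"model": "refresh"},      # the JSON body shown above
        )
        resp.raise_for_status()
        print(resp.status_code)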

    Disabling OData Session Caching

    DataDirect Cloud includes another performance optimization for OData: the connection to the back-end data source is cached in a session associated with the data source being queried.  Caching the back-end connection improves performance when multiple OData queries are submitted to the same data source, because the connection does not need to be created on every query.  It is established on the first query, saved in the session, and reused in subsequent queries.

    While useful in production for improving performance, caching the back-end connection can get in the way when you are trying to configure a data source for OData.  If a change is made to any of the DataDirect Cloud data source properties, including the ODataSchemaMapping, those changes will not be seen, because the connection (established using the old data source definition) has been cached and will be reused by subsequent OData queries to the data source.  The session that caches the back-end connection is discarded if there is no activity against the data source for approximately 5 minutes.

    When configuring a data source for OData, it is recommended that OData session caching be disabled.  This is done by setting the OData Data Source Caching data source property to zero.  Once you are happy with the OData configuration for the data source, set the OData Data Source Caching property back to 1 to get the performance improvement provided by caching the connection to the back-end data source.

  • SQL Server Integration Services released an OData connector this year: www.mattmasson.com/.../odata-source-for-ssis. We're using this in conjunction with Progress DataDirect to pull records from Salesforce. It works really well. The one caution is that the record limit seems to be about 2,000 records if you set the Caching property to zero (otherwise it throws an error). Good advice on that Caching property!

  • Thanks for the feedback.  Regarding the 2000 records coming back, the OData service in DataDirect Cloud returns large results in pages.  By default the page size is 2000 entities.  At the end of a page of results there is a Next link that, if followed, returns the next page of results.  I wonder if the Next link is not being followed?  There is an option, OData Page Size, that can be used to change the number of entities returned in a page.  If OData Page Size is set to 5000, do you now see 5000 records coming back?
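
    In case it helps, here is a rough sketch of following the Next links programmatically with Python and the requests library.  The query URL and credentials are placeholders, and the JSON property names assume an OData V4-style response (entities returned in value, continuation URL in @odata.nextLink); older OData versions return them as d.results and d.__next instead, so adjust the keys to match what the service actually sends:

        # Rough sketch: keep requesting pages until there is no Next link.
        import requests

        url = "<your OData entity-set URL>"   # placeholder for your query URL
        auth = ("myuser", "mypassword")       # placeholder credentials

        rows = []
        while url:
            page = requests.get(url, auth=auth,
                                headers={"Accept": "application/json"}).json()
            rows.extend(page.get("value", []))    # entities in this page
            url = page.get("@odata.nextLink")     # None when no more pages
        print(len(rows), "entities fetched")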

  • I will test this sometime. At present, however, it appears as if the OData source consumes a significantly higher proportion of the Data Transfer limit than does querying Salesforce data using the ODBC driver. Not sure why this is, exactly - is DataDirect Cloud including non-value data (the "string" characters labeling the data field, along with other formatting that surrounds the actual data one is interested in pulling) in its "tally" of the data transfer limit? If so, it's difficult to imagine anyone choosing to use the OData source over using the ODBC connector, as staying within the limitations of the data transfer limit is probably something fairly important to most who use DataDirect Cloud.

  • Matt, you are correct.  The ODBC and JDBC APIs are more efficient at transferring data.  That is because with ODBC and JDBC we are using a binary protocol that we designed to be as efficient as possible at transferring data.  We can do that because we control both the client and the server components.  With OData, the responses need to comply with the standard.  These responses are either JSON or XML, both of which are text-based.  This means that the value data must be converted to its string representation before it is transferred.  Depending on the data being queried, this can lead to an expansion in the number of bytes needed to transfer the data.  For example, binary data is hex-encoded, leading to a 2x expansion of the data.  Added to that is the non-value data needed for the JSON or XML representation.
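
    As a quick illustration of that hex expansion (generic Python, not DataDirect-specific code):

        # Encoding binary data as hex for a text-based (JSON/XML) response
        # doubles its size: each byte becomes two ASCII characters.
        import binascii, os

        raw = os.urandom(16)            # 16 bytes of binary data
        hexed = binascii.hexlify(raw)   # 32 ASCII characters
        print(len(raw), len(hexed))     # prints: 16 32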

    We are including the total size of the response (the expanded value data and the non-value data) in what is charged against the transfer limit.  That is something we could look into if it is a significant issue.