OpenSplice DDS Forum

Hans van 't Hag

About Hans van 't Hag

  • Rank
    Product Manager

Profile Information

  • Location
    Hengelo, The Netherlands
  • Company
    ADLINK Technology

  1. Hi, Thanks for explaining. I have a few remarks/questions:
     • I see you're using TRANSIENT durability, which is typically exploited when using 'federated deployment', where federations have a configured durability-service that maintains non-volatile data for late-joiners. In case you're using standalone deployment (aka 'single-process'), which is the only option when using the community edition, you can still use TRANSIENT data, but that then relies on applications being active that have configured a durability-service (in their 'ospl' configuration), and the historical data retained by an application is 'tied' to the lifecycle of that application (so when all apps are 'gone', there won't be any retained historical data). Another consequence is that each application potentially retains a copy of ALL historical data, whether it's interested in it or not. You might want to consider using TRANSIENT_LOCAL data, which is more suitable for standalone deployment as that data is (solely) maintained by the producer (writer) of that TRANSIENT_LOCAL data. Note that the amount of TRANSIENT_LOCAL data retained by writers is (like for TRANSIENT data) driven by the topic-level durability_service settings.
     • I see that you don't configure the durability_service QoS-settings on the topic, which means that defaults will apply, i.e. a KEEP_LAST/1 policy (so 1 historical sample per instance will be retained by durability-services in case of TRANSIENT data, or by TRANSIENT_LOCAL writers in case of TRANSIENT_LOCAL data). I agree that this might not sound intuitive, but as non-volatile/durable data needs to be (resource-)controlled, the topic-level durability-service QoS-policies (kind/depth & resource-limits) are used to do that (for both TRANSIENT and TRANSIENT_LOCAL data behavior).
     • I see that you distinguish between 'late-joiners' and 'restarted-apps', which is somewhat different from what is typically assumed, where (perhaps even especially) a crashed/restarted app is considered (also) a late-joiner (to regain its state from before the crash/restart). If it would be possible for these apps to detect whether they are restarted or first-started (what you call a late-joiner), you might consider using a 'trick' where you create a reader as volatile (so it won't receive any historical/durable data), but when necessary (i.e. if it is a 'true late-joiner' as you define it), explicitly call 'wait_for_historical_data()' to retrieve historical data 'anyhow' (there's a timeout you can exploit for how long you'd want to wait for that). We had some specific use-cases in the past where for a specific topic there were both periodic writers (for which no durability is required as updates come in regularly) and one-shot writers (whose one-shot update wasn't allowed to 'get lost'), and support for this pattern was introduced for those.
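To make the KEEP_LAST/1 default and the volatile-reader 'trick' above a bit more tangible, here is a minimal toy model in plain Python. It is not the OpenSplice API: the class names `ToyTransientLocalWriter` and `ToyReader`, and the `fetch_history()` method (standing in for 'wait_for_historical_data()'), are made up for illustration only.

```python
from collections import defaultdict, deque


class ToyTransientLocalWriter:
    """Toy stand-in for a TRANSIENT_LOCAL writer (hypothetical, not the DDS API)."""

    def __init__(self, history_depth=1):
        # One bounded queue per instance (key). The depth mirrors the
        # topic-level durability_service KEEP_LAST depth (default 1).
        self._retained = defaultdict(lambda: deque(maxlen=history_depth))
        self._readers = []

    def attach(self, reader):
        """Connect a reader; only non-volatile readers get history right away."""
        self._readers.append(reader)
        if reader.durable:
            reader.samples.extend(self.historical_data())

    def write(self, key, value):
        self._retained[key].append(value)  # oldest sample drops out when full
        for reader in self._readers:
            reader.samples.append((key, value))

    def historical_data(self):
        return [(k, v) for k, queue in self._retained.items() for v in queue]


class ToyReader:
    def __init__(self, durable):
        self.durable = durable  # False models a VOLATILE reader
        self.samples = []

    def fetch_history(self, writer):
        """Toy analogue of explicitly asking for historical data."""
        self.samples.extend(writer.historical_data())


writer = ToyTransientLocalWriter(history_depth=1)
writer.write("sensor-1", 10)
writer.write("sensor-1", 11)  # replaces 10: only 1 sample per instance is kept
writer.write("sensor-2", 20)

restarted = ToyReader(durable=False)      # volatile: gets no history
writer.attach(restarted)

first_started = ToyReader(durable=False)  # volatile, but asks for history
writer.attach(first_started)
first_started.fetch_history(writer)
```

In this sketch the restarted app ends up with no historical samples, while the first-started app sees one sample per instance, which is the distinction the 'trick' is meant to give you.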
  2. GUIDs are automatically/internally generated, so they are not meant to be manually provided (it's surely not part of the DDS-API that you'd want to program against). I'm curious however what problem you're facing (which apparently is related to 'repeated messages') .. could you elaborate on that a little?
  3. Couple of notes:
     • when threads are reported to make no progress, that's often caused by an overloaded system
     • when watermarks are reported to be reached, that's often an indication that data couldn't be delivered
     • when d_namespaceRequests issues are reported, there's an issue with durability (probably i.c.w. the above)
     So I have a few questions:
     • are you using TRANSIENT and/or PERSISTENT topics? If so, please note that those imply running durability-services, which are typically part of federations
     • are you using the community-edition? That edition does not support federations (with configured durability-services)
     • if you're using the community-edition, by default each application is configured with a durability-service-thread, but as apps come and go, you can't be sure of transient/persistent data remaining available when apps are gone
     • so when using the community-edition, I'd strongly suggest to use TRANSIENT_LOCAL topics for non-volatile data, as that is a simpler mechanism where the data is retained at the writer
     • and finally, if you're using the commercially supported version, you can ask such questions and/or file bug-reports directly via our helpdesk and support-portal
  4. Hi Luca, Receiving transient-local data from a destroyed writer would be a true miracle (as that data is solely maintained at that writer). Are you sure that there are no other writers alive in the system whose data you're receiving? The only other possibility would have been if your data was TRANSIENT instead of TRANSIENT_LOCAL and there were other apps alive that have an 'embedded' durability-service (as the community-edition doesn't support federated deployment, where such a durability-service would be part of a federation which doesn't necessarily need to include any applications).
     W.r.t. the 'normal' versus topic-level-durability-service-history-levels: for TRANSIENT and TRANSIENT_LOCAL, the topic-level durability-service history/resource-limits settings actually drive how much historical data is preserved for late-joiners. The 'normal' history-settings aren't about durability but determine the behavior of writer- and reader-caches: for a writer, a KEEP_LAST history implies that when writing data faster than the system (typically the network) can handle, old data will be overwritten with fresh data even before it's transmitted, and for a reader, a KEEP_LAST history means that when a reader can't keep up with the flow of arriving data, the data in its history-cache will be overwritten with fresh data so that 'at least' the most recent data is available for consumption. Note that this 'overwrite-behavior' happens for each instance individually (i.e. a history-depth applies to the history-size for each instance).
     Using a KEEP_ALL policy (at writer and reader) implies flow-control and will (eventually, i.e. after queueing resources are used up) cause end-to-end flow-control where a slow reader determines the speed at which a writer can publish samples. It should therefore be handled with care, i.e. used only for 'event-kind' of data where it's important that all samples are delivered and consumed in order. That is different from typical telemetry or even state-kind of data, where the most recent data is what is typically required, and where this downsampling is thus allowed and can even be considered a feature, as it allows to maintain the decoupling between autonomous applications. Hope this helps a little -Hans
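The difference between KEEP_LAST 'overwrite' and KEEP_ALL 'flow-control' described above can be sketched with a small toy reader-cache in plain Python. This is only an illustration under assumptions: `ToyReaderCache` and its parameters are hypothetical names, and a `False` return from `deliver()` merely stands in for 'the writer gets held back'; it is not how the real middleware signals flow-control.

```python
from collections import deque


class ToyReaderCache:
    """Toy per-instance history cache (hypothetical, not the DDS API)."""

    def __init__(self, keep_last_depth=None, max_samples_per_instance=None):
        # keep_last_depth=None models KEEP_ALL; an integer >= 1 models KEEP_LAST.
        self.keep_last_depth = keep_last_depth
        self.max_samples_per_instance = max_samples_per_instance
        self._instances = {}

    def deliver(self, key, sample):
        """Store a sample; return False if a KEEP_ALL cache has no room left."""
        queue = self._instances.setdefault(key, deque())
        if self.keep_last_depth is not None:
            if len(queue) == self.keep_last_depth:
                queue.popleft()  # overwrite: the oldest sample is pushed out
            queue.append(sample)
            return True
        limit = self.max_samples_per_instance
        if limit is not None and len(queue) >= limit:
            return False  # stands in for flow-control towards the writer
        queue.append(sample)
        return True

    def take(self, key):
        """Consume and return all samples currently held for one instance."""
        queue = self._instances.get(key, deque())
        samples = list(queue)
        queue.clear()
        return samples


# State/telemetry style: a slow reader only sees the most recent samples.
state_cache = ToyReaderCache(keep_last_depth=2)
for value in range(1, 6):
    state_cache.deliver("track-7", value)
latest = state_cache.take("track-7")

# Event style: nothing is dropped, but the producer is held back when full.
event_cache = ToyReaderCache(max_samples_per_instance=3)
accepted = [event_cache.deliver("alarm", n) for n in range(4)]
```

Note that each key (instance) has its own queue, so a burst of updates for one instance never pushes out samples of another.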
  5. I don't think you have to set the service-cleanup-delay. W.r.t. the history, when using KEEP_ALL (for the durability-service QoS) you should also set the resource-limits, as otherwise it's likely that you'll run out of memory.
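As a rough way to reason about why resource-limits matter with KEEP_ALL, the following plain-Python sketch computes an upper bound on the number of retained samples from the three limits. The function name `max_retained_samples` is made up for this note; `-1` is used for 'unlimited', following the DDS `LENGTH_UNLIMITED` convention. It is a back-of-the-envelope aid, not a description of how the durability-service accounts for memory internally.

```python
UNLIMITED = -1  # DDS expresses "no limit" as LENGTH_UNLIMITED (-1)


def max_retained_samples(max_samples, max_instances, max_samples_per_instance):
    """Return an upper bound on retained samples, or None if unbounded."""
    bounds = []
    if max_samples != UNLIMITED:
        bounds.append(max_samples)
    # The per-instance limit only bounds the total if the number of
    # instances is bounded as well.
    if max_instances != UNLIMITED and max_samples_per_instance != UNLIMITED:
        bounds.append(max_instances * max_samples_per_instance)
    return min(bounds) if bounds else None


# All limits left unlimited: nothing stops the history from growing.
unbounded = max_retained_samples(UNLIMITED, UNLIMITED, UNLIMITED)

# 10 instances of at most 5 samples each: never more than 50 samples.
bounded = max_retained_samples(UNLIMITED, 10, 5)
```

Multiplying the bound by a typical sample size gives a first estimate of the memory a KEEP_ALL history could claim; a `None` result means there is no bound at all.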
  6. WiFi is notoriously unreliable when it comes to multicast. If you have an excellent connection that's not an issue, but typically the advantages of using multicast (send-once efficiency) are outweighed by the retransmissions required due to massive data-loss when using multicast. I'm not sure however if that would impact your disconnect/reconnect issues .. but at least it's good to know I guess
  7. In steady state (i.e. a writer isn't writing samples), a late-joiner will receive no more 'durable' (i.e. non-volatile) samples (of instances) than what's defined in the durability-service settings as configured via the topic-qos policy. Those settings are max-samples (for all instances), max-samples-per-instance and/or max-instances. Where these samples are 'stored' depends on the QoS. When using TRANSIENT_LOCAL durability, those samples are stored 'at the writer' (so are gone when the writer terminates); if the durability QoS is set to TRANSIENT (or PERSISTENT), those samples are maintained in 'durability-services'. Now how many of those durability-services there are and where they reside depends on the deployment mode, which is either 'standalone' (which is the only option for the community-edition) or 'federated' (which is only available in the commercially supported version), in which case each federation will typically have a durability-service configured, and these then align themselves so as to assure there are multiple copies available to provide late-joiners with historical data. W.r.t. the reliability-over-wan, it makes sense to NOT use multicast for the data-flows when exploiting WiFi (for discovery it's fine). This can be accomplished by editing the xml-configuration-file and changing the default setting of 'AllowMulticast' from 'true' to 'spdp', which implies that multicast is ONLY used for the discovery-phase but not for the actual data-flows, i.e. when there are multiple recipients, each one will be served by its own unicast-stream, something that often works better than multicast over WiFi: <AllowMulticast>spdp</AllowMulticast>
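If you'd rather script that configuration change than edit the file by hand, a minimal Python sketch could look like the one below. The `SAMPLE_CONFIG` fragment and the `DDSI2Service/General/AllowMulticast` element path are assumptions about how the xml-configuration-file is laid out, and `restrict_multicast_to_discovery` is a made-up helper name; please check the element path against your own configuration file before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumed, heavily trimmed configuration fragment (not a complete ospl config).
SAMPLE_CONFIG = """<OpenSplice>
  <DDSI2Service name="ddsi2">
    <General>
      <NetworkInterfaceAddress>AUTO</NetworkInterfaceAddress>
      <AllowMulticast>true</AllowMulticast>
    </General>
  </DDSI2Service>
</OpenSplice>"""


def restrict_multicast_to_discovery(xml_text):
    """Set every existing AllowMulticast element under the assumed path to 'spdp'."""
    root = ET.fromstring(xml_text)
    for node in root.findall("./DDSI2Service/General/AllowMulticast"):
        node.text = "spdp"  # multicast for discovery only, unicast for data
    return ET.tostring(root, encoding="unicode")


updated_config = restrict_multicast_to_discovery(SAMPLE_CONFIG)
```

The helper only touches elements that already exist, so a configuration without an explicit 'AllowMulticast' entry is returned unchanged.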
  8. I think you need to distinguish between reliability and durability. Reliability is about the guarantee that 'in steady state' the writer-history will be 'eventually' replicated (that is 'delivered') to the reader-history (where it of course can push out samples from that reader's history, depending on the history-policy of that reader). In non-steady-state, old samples in the writer-history might already be overwritten by new ones, depending on whether or not you use a KEEP_ALL history-policy at the writer-side. For short disconnections/reconnections, the reliability-protocol should recover from message-loss, i.e. retransmit those messages that got lost during the disconnect. Yet I suspect here we're talking about multi-second disconnections, which likely implies that the reader needs to be re-discovered after the connection re-establishes (and similarly on 'the other side', i.e. the reader re-discovering the writer) .. It could be that deploying a TRANSIENT_LOCAL durability-QoS is helpful in these cases, as upon reconnection and re-discovery the reader would be considered a 'late-joiner' and therefore would be provided with the historical data kept at the writer-side (you can configure the amount, i.e. 'depth', of that history data using the durability-service QoS-settings on the topic-level)
  9. hmm .. then I don't know what's happening right away .. I'd suggest to raise a ticket with support (preferably with some example-code and used-configs to reproduce the error)
  10. Hi, The xml-config shows that you're using a federated deployment (shared-memory), which implies that you're using the commercially-supported version (as the community-edition only supports 'standalone', i.e. 'single-process', deployment). So from that follows that you have a commercial subscription and can (also) raise a support-ticket for questions/bugs etc. Now back to your issue: the ospl-error.log file suggests that the issue could be that you didn't start the federation (i.e. using 'ospl start') before starting the application. Hope this helps, Regards -Hans
  11. Can you share the ospl-info and ospl-error log files?
  12. Guys, For years we've been running this forum parallel to the GitHub repository: https://github.com/ADLINK-IST/opensplice We concluded that it's more efficient to concentrate on 1 environment and therefore would kindly ask you to direct any remarks/questions to GitHub. Thanks, -Hans
  15. Hi, While (in the past) the X-types spec was still under development and we already had customers that asked for standards-based type-extensibility, we opted to include support for Google protocol buffers in OpenSplice, so that's perhaps an alternative that you might explore. Here's the tutorial for that (as bundled in the community-edition's doc-section too): OpenSplice_GPBTutorial.pdf