OpenSplice DDS Forum


About namruuh

  • Rank
    DDS Expert
  • Birthday 05/05/1987

  1. There are a few things you can try or verify:
     - Are you running the same version of OpenSpliceDDS on both machines?
     - Are you running the same pingpong version on both machines? (If you edited the topic model in one of them, they would not be able to communicate.)
     - Are there any info or error logs on either machine that hint at what is going on?
     - Are the networking and spliced processes still running on both machines after you see those messages (the ping timeout messages that indicate no communication)? Something may have gone wrong and crashed.
     - Can the XP and Linux machines ping each other? If not, that needs to be solved first.
     - Try running Wireshark: do you see the packets from one machine arriving on the other, and vice versa?
     Those are the first few things to look at to get a better idea of what exactly is going on and to help analyse the situation.
  2. Hey, Just a quick question: do you use the wait_for_historical_data call to ensure that the node with the datareaders is fully aligned with all available transient data? If not, look that call up and try it out.
  3. Hey, Yeah, there are definitely other locations to check. You should look at the values of the TEMP and TMP variables in your environment. I think the default value is something like '%USERPROFILE%\Local Settings\Temp'. Also, the application data folder in your user profile might contain a PrismTech directory. I am unsure of the location on Windows 7, but log files could be directed to a location there. It might also be that the 5.7.1 installation did not work because of lingering environment variables set by the old installation, so that is something to look into as well. If you have no luck with this, I can try to install 5.7.1 on one of the Windows 7 machines to see which locations are used. Also, is the 5.7.1 version you are using an evaluation version, or are you a commercial user? Regards, Emiel
  4. Hey, A quick tip for you to try: I remember fixing a bug that occurred when the path pointed to by the OSPL_TEMP environment variable contained directories that did not exist at start time. So try changing the value of that variable to, for example, c:\temp instead of the default c:\temp\ospl, and see if that yields better results. This bug has been fixed in the 5.7.1 version of OpenSpliceDDS (already available to commercial users, I believe), which also contains some robustness improvements for running OpenSpliceDDS as a service (for example, it better supports automated restarting). Hope this helps, Emiel
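
The temp-directory checks suggested in the two posts above can be scripted. The following is only a rough Python sketch: the helper name `check_temp_dirs` is made up for illustration, it inspects nothing beyond the TEMP, TMP and OSPL_TEMP variables named above, and it does not call any OpenSplice API.

```python
import os


def check_temp_dirs(env, names=("TEMP", "TMP", "OSPL_TEMP")):
    """Return {name: (value, exists)} for each variable.

    value is None when the variable is unset; exists is True only when
    the value points at a directory that is already present on disk.
    """
    report = {}
    for name in names:
        value = env.get(name)
        exists = value is not None and os.path.isdir(value)
        report[name] = (value, exists)
    return report


if __name__ == "__main__":
    # Print one line per variable for the current environment.
    for name, (value, exists) in check_temp_dirs(os.environ).items():
        state = "exists" if exists else "missing or unset"
        print(f"{name} = {value!r} ({state})")
```

A variable reported as "missing or unset" for OSPL_TEMP would match the situation described above, where the configured path contains directories that do not exist yet.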
  5. Hi Steve, I am glad to hear you are now able to run your program properly. In principle the deletion of the datawriter should indeed result in cleaning all shared memory, just as the unregister call would. So it is interesting to us why it is leaking memory there. If you want, it would be helpful if you would submit a bugzilla report describing the issue so that in due time an engineer here can look at what is going on and in that way resolve a potential memory leak. For our process on bug reporting, please see this link: http://opensplice.org/cgi-bin/twiki/view/Community/BugReporting Regards, Emiel
  6. Take a look at the reliability element of the DataWriter/DataReader QoS; there is a setting there called synchronous (I am not 100% sure of the exact name). The reference manual should give more information on what it does, which you can use to judge whether it applies to your specific use case.
  7. Hi, Based on the description you provided along with the stacktraces (specifically that it sometimes occurs after 1 hour and sometimes after 3 days), it seems that some sort of memory corruption is going on, and that is hard to analyse without a test program. There are a few steps you can take to see in more detail what is going on. Either you submit a bugzilla report along with a (stripped) test program so that we can analyse it ourselves (when that is done depends on priorities), or you rebuild OpenSpliceDDS yourself with debug symbols included, or even better as a development version (with assertions and such available as well). The development version can sometimes hide errors, though, because of changed timings and such; with debug symbols only there is less risk of that and the error might be easier to reproduce. Once you have one of these two builds, reproduce the error. The stacktrace you get will contain a lot more information, which gives more context about what is going on and why this crash happens. Perhaps you can then debug it yourself, or post more information here so we can see if we can help you further. Regards, Emiel
  8. What is the topic name you are using? My guess is that you are using an invalid topic name (the set of allowed characters is listed in one of the appendices of the reference manual). The error you are getting says 'syntax error near - at line: 1, column: 19', so it seems you are using a '-' character in your topic name. I believe that character is not allowed.
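
To spot this kind of problem before creating the topic, a name can be pre-checked with a few lines of Python. This is a sketch under a stated assumption: it treats a valid name as identifier-like (ASCII letters, digits and underscore, not starting with a digit). The authoritative character set is the one listed in the reference manual appendix, and `first_invalid_char` is a hypothetical helper, not part of OpenSplice.

```python
import string

# Assumption: identifier-style names only. Check the reference manual
# appendix for the character set your OpenSplice version really accepts.
ALLOWED = set(string.ascii_letters + string.digits + "_")


def first_invalid_char(topic_name):
    """Return (column, char) for the first disallowed character, or None.

    Columns are 1-based positions within the topic name itself.
    """
    if topic_name and topic_name[0] in string.digits:
        return (1, topic_name[0])
    for column, char in enumerate(topic_name, start=1):
        if char not in ALLOWED:
            return (column, char)
    return None
```

For example, `first_invalid_char("my-topic")` points at the '-' in position 3, while a name such as `"my_topic"` passes the check.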
  9. Hey Steve, The usage you describe should work fine, so something specific to the settings you have used is causing the issue. Basically, your use case involves writing a message with a specific unique key, which is read by a subscriber and is then no longer of interest. So once your writer has written the message, it can immediately be removed from its administration (as long as the subscriber reads it). For this purpose you have used the autodispose unregistered instances = True setting. This setting ensures that when you unregister an instance (a call on the writer), a dispose message is also produced. The question is whether you want the dispose message. From my understanding of your use case I would say you do not, as it is unneeded information. You just want to unregister the instance so it will be cleaned up everywhere (and since you are not using a transient store, no dispose is needed to accomplish this). So I'd recommend not using the autodispose setting (it will just create more traffic) and using the unregister_instance call on the datawriter instead. Then you have everything covered on the datawriter side.
     On the subscriber (reader) side you also need to perform a step to ensure memory is cleaned up: use the 'take' calls and not the 'read' calls on the datareader. With the 'take' calls you remove the instance from the database; with the 'read' calls you do not. So a take call on the reader and an unregister_instance call on the writer (note: registering the instance is implicit in the write call) will ensure your resources are properly cleaned up.
     One final thing: the amount of shared memory you configure should be related to how many instances will exist in your system at the same time. Logically there should be enough shared memory to hold all data, so (very basically) that is the only thing you need in order to dimension your shared memory size (not forgetting the additional resource usage by OpenSplice itself). It is also good to note that the shared memory manager in OpenSplice 5.4 fragments memory a bit more than the new and improved version in the soon to be released 5.7.1 version of OpenSpliceDDS.
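
The difference between read and take described in this post can be illustrated with a toy model. The class below is a Python stand-in written purely for illustration; `ToyReaderCache` and its methods are hypothetical and are not the OpenSplice API. It models only the reader side and ignores the writer-side unregister step.

```python
class ToyReaderCache:
    """Toy stand-in for a reader's sample cache (not the OpenSplice API)."""

    def __init__(self):
        self._samples = {}  # key -> latest value for that instance

    def deliver(self, key, value):
        """Simulate a sample for the given key arriving from a writer."""
        self._samples[key] = value

    def read(self):
        """Return a copy of the samples and leave them in the cache."""
        return dict(self._samples)

    def take(self):
        """Return the samples and remove them from the cache."""
        taken, self._samples = self._samples, {}
        return taken

    def __len__(self):
        return len(self._samples)
```

In this model, calling `read()` repeatedly never shrinks the cache, so a reader that only reads keeps every instance it has ever received. After `take()` the cache is empty again, which mirrors why take is the call to use when the data is of no further interest.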
  10. Hey Jim, I meant to reply earlier, but it must have slipped my mind. All in all, what you are doing should work fine, so it does indeed seem to be a memory-related problem. I was going to suggest submitting a bugzilla report, so it is great that you have already done so. Future updates will go through that ticket.
  11. We have had customers with issues like these in the past, and they were always resolved successfully. But the solution depends very much on what exactly is going on: sometimes the problem originates from a custom network interface driver, sometimes it is the switch that drops packets (and does not always report it nicely either!), and sometimes it is solved by some specific configuration of OpenSpliceDDS. Hope you resolve it fast!
  12. There are a lot of things to configure in OpenSpliceDDS that can help in that area. But first you need to know exactly what the problem is and have it fully analyzed before you change configuration parameters. I can suggest changes, but I do not think that is fruitful without first understanding the problem completely. So I would recommend doing more analysis of what is causing the problems you see. Generally speaking, the default values are the default for a reason, and you should only deviate from them with a sound reason. As I said, the CPU loads are one piece of the information that is needed, but after that more logging of networking may be needed to analyze the situation. So gather and analyze as much information as you can on the various aspects, and then see if you can reach a conclusion about what is going on. You'll have to do most of that yourself, as it is hard for me to give complete answers on the forum. If you have a support contract, you can always contact PrismTech support for a more one-on-one approach to resolving the issues. In general, the scenario you described should be fine and should not cause issues at all. And since you reported that just a few nodes had this problem, it really seems to point to those nodes being too busy at times, for example. But I can't know for sure.
  13. Hey, It could be a problem with the switch you are using; we have seen that before. But there are some things to investigate first. Can you see the CPU loads on the individual machines, broken down per thread? Maybe some thread is really busy. If no thread is anywhere near 100% load, I'd guess the switch you're using is the more likely suspect. You should also post the XML config file you are using; maybe some configuration settings can explain the behavior.
  14. Hi slewman, If you use a '_var' type instead of a '_ptr' type to store the newly created typeSupport in, the cleanup will be automatic, because _var types are cleaned up automatically when they go out of scope. If you do want to use the _ptr types, then I would suggest you read section 1.3 'Memory Management' of the C++ reference manual delivered with the product. Hope that helps!
  15. Hi Tom, Could you indicate which documentation you are referring to? It would help to know that context. I also wonder where the interest in zero copy comes from: is it just a verification of how something works, or are you looking to achieve certain performance levels? In the latter case, it matters more whether the desired performance is achieved than how (in terms of specific implementation) it is achieved, roughly speaking, especially when the implementation choice does not have any negative functional consequences.