ADLINK-IST/opensplice

Problems running OpenSplice on VxWorks 6.9

aohwang opened this issue · 6 comments

Hi friends,

I am new to OpenSplice. I am trying to run it on VxWorks 6.9, but I have run into some problems.

I cross-compiled OpenSplice for VxWorks 6.9 and built the Pingpong example as a test. VxWorks offers two modes and I chose kernel mode, so the Pingpong example is built as a downloadable kernel module, as described in the documentation.

My OpenSplice configuration is shown below.

<OpenSplice>
   <Domain>
      <Name>ospl_shmem_ddsi</Name>
      <Id>0</Id>
      <Description>Federated deployment using shared-memory and standard DDSI networking.</Description>
      <Database>
         <Size>10485760</Size>
      </Database>
      <Service name="ddsi2">
         <Command>ddsi2</Command>
      </Service>
      <Service name="durability">
         <Command>durability</Command>
      </Service>
      <Service name="cmsoap">
         <Command>cmsoap</Command>
      </Service>
   </Domain>
   <DDSI2Service name="ddsi2">
      <General>
         <NetworkInterfaceAddress>AUTO</NetworkInterfaceAddress>
         <AllowMulticast>true</AllowMulticast>
         <EnableMulticastLoopback>true</EnableMulticastLoopback>
         <CoexistWithNativeNetworking>false</CoexistWithNativeNetworking>
      </General>
      <Compatibility>
         <!-- see the release notes and/or the OpenSplice configurator on DDSI interoperability -->
         <StandardsConformance>lax</StandardsConformance>
         <!-- the following one is necessary only for TwinOaks CoreDX DDS compatibility -->
         <!-- <ExplicitlyPublishQosSetToDefault>true</ExplicitlyPublishQosSetToDefault> -->
      </Compatibility>
   </DDSI2Service>
   <DurabilityService name="durability">
      <ClientDurability enabled="true" />
      <Network>
         <Alignment>
            <TimeAlignment>false</TimeAlignment>
            <RequestCombinePeriod>
               <Initial>2.5</Initial>
               <Operational>0.1</Operational>
            </RequestCombinePeriod>
         </Alignment>
         <WaitForAttachment maxWaitCount="100">
            <ServiceName>ddsi2</ServiceName>
         </WaitForAttachment>
      </Network>
      <NameSpaces>
         <NameSpace name="defaultNamespace">
            <Partition>*</Partition>
         </NameSpace>
         <Policy alignee="Initial" aligner="true" durability="Durable" nameSpace="defaultNamespace" />
      </NameSpaces>
   </DurabilityService>
   <TunerService name="cmsoap">
      <Server>
         <PortNr>Auto</PortNr>
      </Server>
   </TunerService>
</OpenSplice>

After "ospl_spliced" is started, the downloadable kernel module task hangs and the terminal prints the message below.

memPartFree: invalid block 0x9b999f8 in partition 0x381150

At the same time, the debugger shows the following stack trace.

tDds_rjb: 0x9109670(stopped - User request)
Stack trace:
    _taskSuspend() - 0x322d04
    memPartFreeInternal() - 0x24409c
    free() - 0x24471c
    q_parser_parse() - 0x322d04
    q_parse() - q_parser.y:311
    v_resolvePartitions() - v_kernel.c:1570
    v_partitionNew() - v_partition.c:106
    v_partitionAdminFill() - v_partitionAdmin.c:179
    v_publisherEnable() - v_publisher.c:245
    v__entityEnable()[Inline] - v_entity.c:262
    v_entityEnable() - v_entity.c:331
    v_publisherNew() - v_publisher.c:203
    v_builtinNew() - v_builtin.c:820
    v_kernelEnable() - v_kernel.c:1238
    v_entityEnable() - v_entity.c:288
    v_entityEnable() - 0x8f5d5c4
    u_domainNew() - u_domain.c:1121
    u_splicedNew() - u_spliced.c:112
    ospl_spliced_unique_main() - spliced.c:1478
    ospl_spliced_vx_unique_main() - mainWrapper.c:32
    ospl_spliced() - mainWrapper.c:21

Could you suggest why the kernel task is suspended? What is going wrong?

A less important point: I did not put "ospl_metaconfig.xml" on the VxWorks filesystem, so the error message below appears. I don't think it matters, though.

Report          :  ERROR
Date              :  THU JAN 01 00:12:54 1970
Description  :  Failed to open meta configuration file "ospl_metaconfig.xml". The file was not found in the current directory nor could the file be found at the default location beacause the environment variable OSPL_HOME was not set

Thanks

It looks like memory corruption, but I am not doing anything special. I just start the spliced daemon.

Dear Aoh Wang,

You have not set the environment variable, hence the error "* the environment variable OSPL_HOME was not set".
Before starting your application, you need to set all the environment variables by sourcing release.com on Linux or running release.bat on Windows. Keep in mind that release.com/release.bat sets the environment variables only for the current terminal, so you have to run your application from that same terminal.

You can also make the environment variables global; for this, see #128.

With best regards,
Vivek Pandey

Hi vivekpandey02,

I have solved the environment problem mentioned above.

Report          :  ERROR
Date              :  THU JAN 01 00:12:54 1970
Description  :  Failed to open meta configuration file "ospl_metaconfig.xml". The file was not found in the current directory nor could the file be found at the default location beacause the environment variable OSPL_HOME was not set

The remaining problem is the memory corruption.

memPartFree: invalid block 0x9b999f8 in partition 0x381150

Could you give some suggestions?

Could you please print your OSPL_HOME and OSPL_URI variables so we can check that your environment is set correctly? One more thing: our community edition supports only single-process deployment, but you are using a shared-memory configuration.
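
As a rough sketch, assuming the community edition accepts the same `Domain` elements as the configuration above: a single-process deployment is normally selected by adding `<SingleProcess>true</SingleProcess>` to the `Domain` section. The domain name `ospl_sp_ddsi` is only illustrative, the `Database` section is left out on the assumption that shared-memory sizing is not needed in this mode, and whether this changes anything for a VxWorks kernel-mode build is not confirmed in this thread.

```xml
<OpenSplice>
   <Domain>
      <!-- Illustrative name, not taken from this thread -->
      <Name>ospl_sp_ddsi</Name>
      <Id>0</Id>
      <!-- Assumption: run the services inside the application process
           instead of a shared-memory federation -->
      <SingleProcess>true</SingleProcess>
      <Service name="ddsi2">
         <Command>ddsi2</Command>
      </Service>
      <Service name="durability">
         <Command>durability</Command>
      </Service>
      <Service name="cmsoap">
         <Command>cmsoap</Command>
      </Service>
   </Domain>
   <!-- DDSI2Service, DurabilityService and TunerService sections
        stay as in the configuration posted above -->
</OpenSplice>
```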

With best regards,
Vivek

First, I have set the environment variables correctly, and I don't think the environment is the issue, because this is a downloadable kernel module on VxWorks 6.9.

Second, as I mentioned, this is a memory corruption problem.

memPartFree: invalid block 0x9b999f8 in partition 0x381150

I think I have found the root cause: it is a bug in the q_parser.y file. I will open a pull request.