segmentation violation
speedbooster opened this issue · 24 comments
Hi, I have modified the gcd_nontd_test.tcl test script to point to the libraries I am using and to a DEF design output from QFlow. I get the following at the rep init_replace line:
** WARNING: Your Detail Placement Step must be skipped.
(i.e. this program will be executed as -onlyGP)
If you want to have DP after GP, please specify -dpflag and -dploc.
[INFO] TargetDensity = 1.000000
[INFO] ExperimentIndex = 27
[INFO] DirectoryPath = /home/[user]/Desktop/usb/scratch/etc/live/experiment027
child killed: segmentation violation
Is it possible to use the flow at mature nodes, such as 0.6u, 0.35u or 0.18u? I am attempting this with X Fab.
To solve the error, could you share your input files and Tcl scripts with me?
Do you have access to XFab's kits on your end? They're confidential, and I am under NDA.
I am using XC06's digital cells library with 3-metal option (not thick).
Sorry, I don't have access.
Could you give me a log when you type "$ valgrind ./replace < your_script.tcl"?
Yes, sure, I was looking for a method to log the script. Thank you. I'll post back INSHAALLAH
Does RePlAce (and other tools) produce a log file when working? It'd help...
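Whether RePlAce writes its own log file is not answered in this thread. As a minimal sketch of one way to capture everything a run prints, the Python helper below feeds a script to a command on stdin and writes stdout and stderr to one file. The function name run_and_log and the file names in the usage note are hypothetical, not part of RePlAce.

```python
import subprocess


def run_and_log(command, script_path, log_path):
    """Run `command` with the script on stdin; write stdout and stderr to log_path.

    Returns the exit code. On POSIX, a negative value -N means the process
    was terminated by signal N.
    """
    with open(script_path, "rb") as script, open(log_path, "wb") as log:
        result = subprocess.run(
            command,
            stdin=script,              # equivalent to `command < script_path`
            stdout=log,
            stderr=subprocess.STDOUT,  # interleave errors with normal output
            check=False,               # a crash should not raise here
        )
    return result.returncode
```

A call such as run_and_log(["valgrind", "./replace"], "your_script.tcl", "replace_valgrind.log") would mirror the command suggested above. Valgrind can also write its own report to a file with its --log-file=<filename> option.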
The script:
#
# Examples for Non Timing-driven RePlAce with TCL usage
#
set design live
set lib_dir /home/[user]/pdk
set design_dir /home/[user]/Desktop/usb
set lef_path ${design_dir}/scratch/pdk.lef
set def_path ${design_dir}/layout/${design}.def
replace_external rep
# Import LEF/DEF files
rep import_lef $lef_path
rep import_def $def_path
rep set_output $design_dir/scratch
puts $def_path
rep set_verbose_level 0
# Initialize RePlAce
rep init_replace
# place_cell with BiCGSTAB
#rep place_cell_init_place
# print out instances' x/y coordinates
#rep print_instances
# place_cell with Nesterov method
#rep place_cell_nesterov_place
# print out instances' x/y coordinates
#rep print_instances
# Export DEF file
#rep export_def ${design_dir}/scratch/${design}_nontd.def
puts "Final HPWL: [rep get_hpwl]"
Previously I typed replace in the console, then replace < [script_name].tcl in the RePlAce Tcl console, which gave me the earlier output.
This time I typed valgrind replace < [script_name].tcl:
==16894== Memcheck, a memory error detector
==16894== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16894== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==16894== Command: replace
==16894==
==16894== Invalid read of size 1
==16894== at 0x5D2B4A: GetTokenFromStack (lef_keywords.cpp:345)
==16894== by 0x5D2B4A: LefDefParser::GetToken(char**, int*) [clone .cold.152] (lef_keywords.cpp:405)
==16894== by 0x95F0A5: LefDefParser::lefyylex() (lef_keywords.cpp:657)
==16894== by 0x973D01: LefDefParser::lefyyparse() (lef.y:579)
==16894== by 0x6396E8: Replace::Circuit::ParseLef(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, bool) (lefParser.cpp:3218)
==16894== by 0x62AA81: Init (lefdefIO.h:134)
==16894== by 0x62AA81: ParseLefDef() (lefdefIO.cpp:403)
==16894== by 0x62ACC6: ParseInput() (lefdefIO.cpp:241)
==16894== by 0x6C4604: replace_external::init_replace() (replace_external.cpp:325)
==16894== by 0x6CB065: _wrap_replace_external_init_replace (replace_wrap.cpp:3240)
==16894== by 0x6C8E1E: SWIG_Tcl_MethodCommand (replace_wrap.cpp:1329)
==16894== by 0x5C4AEB1: ??? (in /usr/lib64/libtcl8.5.so)
==16894== by 0x5C8F36B: ??? (in /usr/lib64/libtcl8.5.so)
==16894== by 0x5C97646: ??? (in /usr/lib64/libtcl8.5.so)
==16894== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==16894==
==16894==
==16894== Process terminating with default action of signal 11 (SIGSEGV)
==16894== Access not within mapped region at address 0x0
==16894== at 0x5D2B4A: GetTokenFromStack (lef_keywords.cpp:345)
==16894== by 0x5D2B4A: LefDefParser::GetToken(char**, int*) [clone .cold.152] (lef_keywords.cpp:405)
==16894== by 0x95F0A5: LefDefParser::lefyylex() (lef_keywords.cpp:657)
==16894== by 0x973D01: LefDefParser::lefyyparse() (lef.y:579)
==16894== by 0x6396E8: Replace::Circuit::ParseLef(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, bool) (lefParser.cpp:3218)
==16894== by 0x62AA81: Init (lefdefIO.h:134)
==16894== by 0x62AA81: ParseLefDef() (lefdefIO.cpp:403)
==16894== by 0x62ACC6: ParseInput() (lefdefIO.cpp:241)
==16894== by 0x6C4604: replace_external::init_replace() (replace_external.cpp:325)
==16894== by 0x6CB065: _wrap_replace_external_init_replace (replace_wrap.cpp:3240)
==16894== by 0x6C8E1E: SWIG_Tcl_MethodCommand (replace_wrap.cpp:1329)
==16894== by 0x5C4AEB1: ??? (in /usr/lib64/libtcl8.5.so)
==16894== by 0x5C8F36B: ??? (in /usr/lib64/libtcl8.5.so)
==16894== by 0x5C97646: ??? (in /usr/lib64/libtcl8.5.so)
==16894== If you believe this happened as a result of a stack
==16894== overflow in your program's main thread (unlikely but
==16894== possible), you can try to increase the size of the
==16894== main thread stack using the --main-stacksize= flag.
==16894== The main thread stack size used in this run was 8388608.
==16894==
==16894== HEAP SUMMARY:
==16894== in use at exit: 999,096 bytes in 896 blocks
==16894== total heap usage: 1,084 allocs, 188 frees, 1,458,632 bytes allocated
==16894==
==16894== LEAK SUMMARY:
==16894== definitely lost: 0 bytes in 0 blocks
==16894== indirectly lost: 0 bytes in 0 blocks
==16894== possibly lost: 788,777 bytes in 45 blocks
==16894== still reachable: 210,319 bytes in 851 blocks
==16894== suppressed: 0 bytes in 0 blocks
==16894== Rerun with --leak-check=full to see details of leaked memory
==16894==
==16894== For counts of detected and suppressed errors, rerun with: -v
==16894== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
I got the exact same error when I used CentOS 8/gcc8.
Unfortunately, I still have no clue why this happens.
Could you use the Docker environment instead?
Yes, I do have gcc 8.3.0. Machine is CentOS 7.
What is your development platform? I am now compiling using gcc 4.8.5
If you can compile again with gcc-4.8.5/CentOS7, the new binary should not have this problem.
I think this issue is due to differences in how the compilers handle memory in copy constructors/malloc...
We're going to integrate OpenDB with RePlAce, so this error would be removed in the near future.
(My modified LEF/DEF parsers seem to have a compiler-dependent problem...)
Here, compiled using 4.8.5:
==21032== Memcheck, a memory error detector
==21032== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21032== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==21032== Command: replace
==21032==
RePlAce Version: 1.0.0
** WARNING: Your Detail Placement Step must be skipped.
(i.e. this program will be executed as -onlyGP)
If you want to have DP after GP, please specify -dpflag and -dploc.
[INFO] TargetDensity = 1.000000
[INFO] ExperimentIndex = 34
[INFO] DirectoryPath = /home/[user]/Desktop/usb/scratch/etc/live/experiment034
[INFO] DefUnit = 100
[INFO] LefMetal1Name = METAL1
==21032== Invalid read of size 8
==21032== at 0x8BF6E0: LefDefParser::defiRow::macro() const (defiRowTrack.cpp:364)
==21032== by 0x5DABC1: SetParameter() (lefdefIO.cpp:352)
==21032== by 0x5E036C: ParseLefDef() (lefdefIO.cpp:405)
==21032== by 0x5E065D: ParseInput() (lefdefIO.cpp:241)
==21032== by 0x673483: replace_external::init_replace() (replace_external.cpp:325)
==21032== by 0x678B34: _wrap_replace_external_init_replace (replace_wrap.cpp:3240)
==21032== by 0x67C8A4: SWIG_Tcl_MethodCommand (replace_wrap.cpp:1329)
==21032== by 0x5C4AEB1: ??? (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C8F36B: ??? (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C97646: ??? (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C4C6B6: TclEvalObjEx (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C9D38B: Tcl_RecordAndEvalObj (in /usr/lib64/libtcl8.5.so)
==21032== Address 0x18 is not stack'd, malloc'd or (recently) free'd
==21032==
==21032==
==21032== Process terminating with default action of signal 11 (SIGSEGV)
==21032== Access not within mapped region at address 0x18
==21032== at 0x8BF6E0: LefDefParser::defiRow::macro() const (defiRowTrack.cpp:364)
==21032== by 0x5DABC1: SetParameter() (lefdefIO.cpp:352)
==21032== by 0x5E036C: ParseLefDef() (lefdefIO.cpp:405)
==21032== by 0x5E065D: ParseInput() (lefdefIO.cpp:241)
==21032== by 0x673483: replace_external::init_replace() (replace_external.cpp:325)
==21032== by 0x678B34: _wrap_replace_external_init_replace (replace_wrap.cpp:3240)
==21032== by 0x67C8A4: SWIG_Tcl_MethodCommand (replace_wrap.cpp:1329)
==21032== by 0x5C4AEB1: ??? (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C8F36B: ??? (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C97646: ??? (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C4C6B6: TclEvalObjEx (in /usr/lib64/libtcl8.5.so)
==21032== by 0x5C9D38B: Tcl_RecordAndEvalObj (in /usr/lib64/libtcl8.5.so)
==21032== If you believe this happened as a result of a stack
==21032== overflow in your program's main thread (unlikely but
==21032== possible), you can try to increase the size of the
==21032== main thread stack using the --main-stacksize= flag.
==21032== The main thread stack size used in this run was 8388608.
==21032==
==21032== HEAP SUMMARY:
==21032== in use at exit: 36,814,447 bytes in 1,179,547 blocks
==21032== total heap usage: 2,760,034 allocs, 1,580,487 frees, 4,678,754,460 bytes allocated
==21032==
==21032== LEAK SUMMARY:
==21032== definitely lost: 11,829,995 bytes in 569,819 blocks
==21032== indirectly lost: 2,904 bytes in 352 blocks
==21032== possibly lost: 756,049 bytes in 44 blocks
==21032== still reachable: 24,225,499 bytes in 609,332 blocks
==21032== of which reachable via heuristic:
==21032== stdstring : 1,366,181 bytes in 49,366 blocks
==21032== suppressed: 0 bytes in 0 blocks
==21032== Rerun with --leak-check=full to see details of leaked memory
==21032==
==21032== For counts of detected and suppressed errors, rerun with: -v
==21032== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
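Both Memcheck reports in this thread fault at an address at or near zero (0x0 and 0x18). A fault address that close to zero usually indicates a read through a null pointer, with the small offset being the position of a member inside the object. As a minimal sketch, the Python helper below pulls the first invalid access, its top stack frame, and the fault address out of a Memcheck log. The function name summarize_memcheck and the 4096-byte "near null" threshold are assumptions made for illustration.

```python
import re

INVALID_RE = re.compile(r"Invalid (read|write) of size (\d+)")
FRAME_RE = re.compile(r"\bat 0x[0-9A-Fa-f]+: (.+?) \(([^()]+:\d+)\)\s*$")
ADDR_RE = re.compile(r"Address (0x[0-9A-Fa-f]+) is not stack'd")


def summarize_memcheck(log_text):
    """Describe the first invalid access in a Memcheck log, or return None."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        hit = INVALID_RE.search(line)
        if not hit:
            continue
        summary = {
            "kind": hit.group(1),
            "size": int(hit.group(2)),
            "function": None,
            "location": None,
            "address": None,
            "near_null": False,
        }
        for follow in lines[i + 1:]:
            frame = FRAME_RE.search(follow)
            if frame and summary["function"] is None:
                # The first "at 0x...:" line is the innermost frame.
                summary["function"] = frame.group(1)
                summary["location"] = frame.group(2)
            addr = ADDR_RE.search(follow)
            if addr:
                summary["address"] = int(addr.group(1), 16)
                # Assumed threshold: anything inside the first page.
                summary["near_null"] = summary["address"] < 4096
                break
        return summary
    return None
```

Applied to the gcc 4.8.5 log above, this would report an 8-byte read in LefDefParser::defiRow::macro() const at defiRowTrack.cpp:364 with fault address 0x18.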
It means your DEF has a problem...
Could you check that your DEF contains the ROW definition?
No, it does not contain ROW definitions.
ROW statements must be defined in the DEF to run the global placer.
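The ROW check suggested above can be done before the placer is started. The Python sketch below scans DEF text for ROW statements, assuming they follow the usual DEF form "ROW rowName siteName origX origY orient ..." with one statement starting per line. The function name find_rows is hypothetical, and the row and site names in the usage note are made up.

```python
import re

# Matches the leading fields of a DEF ROW statement:
#   ROW rowName siteName origX origY orient [DO numX BY numY [STEP stepX stepY]] ;
ROW_RE = re.compile(
    r"^\s*ROW\s+(?P<name>\S+)\s+(?P<site>\S+)\s+"
    r"(?P<x>-?\d+)\s+(?P<y>-?\d+)\s+(?P<orient>\S+)",
    re.MULTILINE,
)


def find_rows(def_text):
    """Return one dict per ROW statement found in the DEF text."""
    rows = []
    for match in ROW_RE.finditer(def_text):
        row = match.groupdict()
        row["x"] = int(row["x"])  # origin in DEF database units
        row["y"] = int(row["y"])
        rows.append(row)
    return rows
```

For a DEF containing the line "ROW ROW_0 unit 0 0 N DO 10 BY 1 STEP 100 0 ;", find_rows returns a single entry with name ROW_0 and site unit. An empty result means the file has no ROW statements, which is the situation reported in this thread.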
OK, maybe I should just use the openflow as the complete flow instead of QFlow.
I tried a make test on OpenROAD-Utilities/verilog-to-def and it resulted in a segfault. I reran make using gcc 4.8.5 and it doesn't segfault.
Where can I ask questions regarding the tool usage?
I see capacitance and resistance units for RePlAce. What should I set as the unit values? Aren't resistance and capacitance per micron dependent on the metal layer, with a different value for each layer?
Secondly, what's the dploc flag? Google returned a result explaining the flags: RePlAce/doc/BinaryArguments.md, but apparently the file is no longer available.
> I tried a make test on OpenROAD-Utilities/verilog-to-def and it resulted in a segfault. I reran make using gcc 4.8.5 and it doesn't segfault.
OpenROAD-Utilities/verilog-to-def is obsolete. Could you use https://github.com/The-OpenROAD-Project/Resizer instead?
Could you please point me to a document which states the flow you follow?
I am trying to follow the demo script on yosys.
> Where can I ask questions regarding the tool usage?
> I see capacitance and resistance units for RePlAce. What should I set as the unit values? Aren't resistance and capacitance per micron dependent on the metal layer, with a different value for each layer?
This is a known problem for us. For now, you may use M1 values. We're going to use a Machine-learning model or Global-Router tree model in the future.
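If the M1 values come from a LEF routing layer, one common way to turn them into per-micron numbers is sketched below in Python. It assumes the layer gives RESISTANCE RPERSQ (ohms per square), CAPACITANCE CPERSQDIST (pF per square micron), WIDTH (microns) and optionally EDGECAPACITANCE (pF per micron). The function name is hypothetical, and which units RePlAce expects for its settings is not stated in this thread, so the results may need converting.

```python
def unit_rc_per_micron(rpersq_ohm, cpersqdist_pf, width_um, edgecap_pf_per_um=0.0):
    """Estimate wire resistance and capacitance per micron of length.

    A wire of width W has 1/W squares per micron of length, so
    R per micron = RPERSQ / W. Its area per micron of length is W, so the
    area capacitance is CPERSQDIST * W; the two edges add 2 * EDGECAPACITANCE.
    """
    if width_um <= 0:
        raise ValueError("width_um must be positive")
    r_per_um = rpersq_ohm / width_um                            # ohm / um
    c_per_um = cpersqdist_pf * width_um + 2.0 * edgecap_pf_per_um  # pF / um
    return r_per_um, c_per_um
```

With placeholder values that are not taken from any real PDK (RPERSQ 0.08, CPERSQDIST 3.0e-5, WIDTH 0.5, EDGECAPACITANCE 4.0e-5), the call unit_rc_per_micron(0.08, 3.0e-5, 0.5, 4.0e-5) gives about 0.16 ohm/um and 9.5e-5 pF/um.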
> Secondly, what's the dploc flag? Google returned a result explaining the flags: RePlAce/doc/BinaryArguments.md, but apparently the file is no longer available.
-dpflag is no longer supported. You can use OpenDP as the detailed placer in our flow.
https://github.com/The-OpenROAD-Project/OpenDP
> -dpflag is no longer supported.
I see reference to this flag on https://github.com/The-OpenROAD-Project/yosys
> Could you please point me to a document which states the flow you follow?
> I am trying to follow the demo script on yosys.
Could you check the alpha-release repositories?
- Dockerfile for building all of the OpenROAD tools:
  https://github.com/The-OpenROAD-Project/alpha-release/tree/master/build/docker
- Makefile for running all of the OpenROAD tools:
  https://github.com/The-OpenROAD-Project/alpha-release/blob/master/flow/Makefile
> -dpflag is no longer supported.
> I see reference to this flag on https://github.com/The-OpenROAD-Project/yosys
You can ignore that flag now. It is only used when you're going to use physical synthesis.
Yosys should be updated to drop -dpflag.
ALHAMDOLILLAH I managed to get the flow running. I didn't use Docker, but instead ran setup.sh in the released binaries, and then ran the flow from the source I downloaded from GitHub, /alpha-release/flow
I ran make in the flow directory; however, it wouldn't proceed to the final (finish) stage. I edited the Makefile target from all: route to all: finish (I wonder why it wasn't that already...)
The final GDS is attached here. Please have a look. I don't remember altering any 'area' parameter in the configuration files for the gcd design, but the GDS seems odd. It gave me 8% utilization. Also, the power buses seem to be missing and the cells seem to overlap in two consecutive rows.
Am I doing something wrong...?
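For context on the 8% figure, the Python sketch below assumes utilization means total standard-cell area divided by placeable core area, and shows how the core would have to shrink to reach a higher target. The function names and all numbers in the usage note are hypothetical; the actual gcd cell area is not given in this thread.

```python
import math


def utilization(total_cell_area_um2, core_area_um2):
    """Fraction of the core area occupied by standard cells."""
    if core_area_um2 <= 0:
        raise ValueError("core_area_um2 must be positive")
    return total_cell_area_um2 / core_area_um2


def square_core_side(total_cell_area_um2, target_utilization):
    """Side length (um) of a square core that yields the target utilization."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return math.sqrt(total_cell_area_um2 / target_utilization)
```

For example, 800 um2 of cells in a 100 um by 100 um core gives utilization(800.0, 10000.0) of 0.08, and square_core_side(800.0, 0.5) returns 40.0, the side of a square core at 50% utilization.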
Sorry for my late reply.
You can change the DIEAREA in the following Makefile settings:
I will close this issue. (Your issue is out of my scope)