python preprocessor
jalvesz opened this issue · 11 comments
Motivation
This proposal is meant to start a discussion on a replacement for the current fpm CI for stdlib. The idea is to use a Python script to preprocess stdlib before building with fpm or CMake. While CMake already has a customized fypp preprocessing step, such a script could replace it through a PRE_BUILD action with add_custom_command. Also, the fpm branch currently lacks a flexible way of adding dependencies in the toml file. This proposal tries to remedy these shortcomings.
Prior Art
No response
Additional Information
Proposal
Say a fypp_deployment.py file is at the root of stdlib:
import os
import fypp
import argparse
from joblib import Parallel, delayed

def pre_process_toml(kargs):
    """
    Pre-process the fpm.toml
    """
    from tomlkit import table, dumps
    data = table()
    data.add("name", "stdlib")
    data.add("version", str(kargs.vmajor)+
                    "."+str(kargs.vminor)+
                    "."+str(kargs.vpatch) )
    data.add("license", "MIT")
    data.add("author", "stdlib contributors")
    data.add("maintainer", "@fortran-lang/stdlib")
    data.add("copyright", "2019-2021 stdlib contributors")

    if(kargs.with_blp):
        build = table()
        build.add("link", ["lapack", "blas"] )
        data.add("build", build)

    dev_dependencies = table()
    dev_dependencies.add("test-drive", {"git" : "https://github.com/fortran-lang/test-drive",
                                        "tag" : "v0.4.0"})
    data.add("dev-dependencies", dev_dependencies)

    preprocess = table()
    preprocess.add("cpp", {} )
    preprocess['cpp'].add("suffixes", [".F90", ".f90", ".fypp"] )
    preprocess['cpp'].add("macros", ["MAXRANK="+str(kargs.maxrank),
                                     "PROJECT_VERSION_MAJOR="+str(kargs.vmajor),
                                     "PROJECT_VERSION_MINOR="+str(kargs.vminor),
                                     "PROJECT_VERSION_PATCH="+str(kargs.vpatch)] )
    data.add("preprocess", preprocess)

    with open("fpm.toml", "w") as f:
        f.write(dumps(data))

C_PREPROCESSED = (
    "stdlib_linalg_constants" ,
    "stdlib_linalg_blas" ,
    "stdlib_linalg_blas_aux",
    "stdlib_linalg_blas_s",
    "stdlib_linalg_blas_d",
    "stdlib_linalg_blas_q",
    "stdlib_linalg_blas_c",
    "stdlib_linalg_blas_z",
    "stdlib_linalg_blas_w",
    "stdlib_linalg_lapack",
    "stdlib_linalg_lapack_aux",
    "stdlib_linalg_lapack_s",
    "stdlib_linalg_lapack_d",
    "stdlib_linalg_lapack_q",
    "stdlib_linalg_lapack_c",
    "stdlib_linalg_lapack_z",
    "stdlib_linalg_lapack_w"
)

def pre_process_fypp(kargs):
    kwd = []
    kwd.append("-DMAXRANK="+str(kargs.maxrank))
    kwd.append("-DPROJECT_VERSION_MAJOR="+str(kargs.vmajor))
    kwd.append("-DPROJECT_VERSION_MINOR="+str(kargs.vminor))
    kwd.append("-DPROJECT_VERSION_PATCH="+str(kargs.vpatch))
    if kargs.with_qp:
        kwd.append("-DWITH_QP=True")
    if kargs.with_xqp:
        kwd.append("-DWITH_XQP=True")

    optparser = fypp.get_option_parser()
    options, leftover = optparser.parse_args(args=kwd)
    options.includes = ['include']
    # options.line_numbering = True
    tool = fypp.Fypp(options)

    # Define the folders to search for *.fypp files
    folders = ['src', 'test', 'example']
    # Process all folders
    fypp_files = [os.path.join(root, file) for folder in folders
                  for root, _, files in os.walk(folder)
                  for file in files if file.endswith(".fypp")]

    def process_f(file):
        source_file = file
        basename = os.path.splitext(source_file)[0]
        sfx = 'f90' if os.path.basename(basename) not in C_PREPROCESSED else 'F90'
        target_file = basename + '.' + sfx
        tool.process_file(source_file, target_file)

    Parallel(n_jobs=kargs.njob)(delayed(process_f)(f) for f in fypp_files)
    return

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Preprocess stdlib source files.')
    # fypp arguments
    parser.add_argument("--vmajor", type=int, default=0, help="Project Version Major")
    parser.add_argument("--vminor", type=int, default=4, help="Project Version Minor")
    parser.add_argument("--vpatch", type=int, default=0, help="Project Version Patch")
    parser.add_argument("--njob", type=int, default=4, help="Number of parallel jobs for preprocessing")
    parser.add_argument("--maxrank",type=int, default=7, help="Set the maximum allowed rank for arrays")
    parser.add_argument("--with_qp",type=bool, default=False, help="Include WITH_QP in the command")
    parser.add_argument("--with_xqp",type=bool, default=False, help="Include WITH_XQP in the command")
    # external libraries arguments
    parser.add_argument("--with_blp",type=bool, default=False, help="Link against OpenBLAS")

    args = parser.parse_args()
    pre_process_toml(args)
    pre_process_fypp(args)
Example:
python fypp_deployment.py --with_blp 1
This would produce the following fpm.toml:
name = "stdlib"
version = "0.4.0"
license = "MIT"
author = "stdlib contributors"
maintainer = "@fortran-lang/stdlib"
copyright = "2019-2021 stdlib contributors"
[build]
link = ["lapack", "blas"]
[dev-dependencies]
[dev-dependencies.test-drive]
git = "https://github.com/fortran-lang/test-drive"
tag = "v0.4.0"
[preprocess]
[preprocess.cpp]
suffixes = [".F90", ".f90", ".fypp"]
macros = ["MAXRANK=7", "PROJECT_VERSION_MAJOR=0", "PROJECT_VERSION_MINOR=4", "PROJECT_VERSION_PATCH=0"]
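A note on the boolean options used in the example: argparse applies type=bool by calling bool() on the raw string, so --with_blp 0 would still enable the option, because any non-empty string is truthy. A minimal sketch of a stricter converter is shown here; str2bool and make_parser are hypothetical names for illustration, not part of the proposed script.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert a command-line string to a bool (hypothetical helper)."""
    text = value.strip().lower()
    if text in ("1", "true", "yes", "on"):
        return True
    if text in ("0", "false", "no", "off"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")


def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Preprocess stdlib source files.")
    # Same option name as in the script above, but with an explicit converter.
    parser.add_argument("--with_blp", type=str2bool, default=False,
                        help="Link against OpenBLAS")
    return parser


args = make_parser().parse_args(["--with_blp", "0"])
print(args.with_blp)  # False, whereas type=bool would give True for "0"
```

With this converter the example call python fypp_deployment.py --with_blp 1 keeps working unchanged.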
Current limitations
- In this first example the fypp files are being preprocessed into .f90 files at the same location. Should they go somewhere else?
I've made updates to the script: it now runs in parallel and takes the C-preprocessed files into account by giving them the .F90 suffix. Running fpm build or fpm test from the root now works, since an fpm.toml is available.
- Where should the processed files end up if this strategy is adopted? Is it OK to leave them next to the fypp files, or should a dedicated subfolder be provisioned?
- Running the tests creates a bunch of files that end up in the root. Maybe a default dump directory should be set for all tests?
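On the question of where the processed files should go, one option is to compute each target path from a destination directory while mirroring the source tree. The following is only a sketch under stated assumptions: target_path and its destdir parameter are illustrative names, the source paths are assumed to be relative (as produced by os.walk('src')), some_module.fypp is a placeholder file name, and C_PREPROCESSED is shortened here.

```python
import os

# Shortened stand-in for the C_PREPROCESSED tuple of the script above.
C_PREPROCESSED = ("stdlib_linalg_constants", "stdlib_linalg_blas")


def target_path(source_file, destdir, c_preprocessed=C_PREPROCESSED):
    """Map a relative .fypp path to its output path under destdir.

    The directory layout of the source tree is mirrored below destdir;
    files listed in c_preprocessed get the .F90 suffix, all others .f90.
    """
    stem, _ = os.path.splitext(source_file)
    suffix = ".F90" if os.path.basename(stem) in c_preprocessed else ".f90"
    return os.path.normpath(os.path.join(destdir, stem + suffix))


# A dedicated output folder keeps generated files away from the sources ...
print(target_path(os.path.join("src", "some_module.fypp"), "generated"))
# ... while destdir="." reproduces the current "next to the fypp file" layout.
print(target_path(os.path.join("src", "stdlib_linalg_blas.fypp"), "."))
```

The output directory has to exist before fypp writes into it, for example via os.makedirs(os.path.dirname(path), exist_ok=True).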
I believe a great starting point would be for the preprocessing script to let both CMake and the stdlib-fpm build produce exactly the same output as they do now. In other words, it would be the single source to modify when the folder structure changes (removing the customized commands from fpm-deployment).
Totally agree! Adding such a script to CMake could look something like:
add_custom_command(
    TARGET stdlib
    PRE_BUILD
    COMMAND python ${CMAKE_CURRENT_SOURCE_DIR}/fypp_deployment.py)
What I haven't figured out (or tried) yet is whether we could do something like
fpm build fypp_deployment.py [--options]
so that the fpm command line can profit from it.
I think I misunderstand the aim of this script. So my questions may sound strange.
> Totally agree! Adding such a script to CMake could look something like:
Would this replace the call to fypp inside CMake?
> What I haven't figured out (tried) yet is if we could do something like
> fpm build fypp_deployment.py [--options]
> To have the fpm command line profit from it.
Since the script fypp_deployment.py generates the fpm.toml file, how would such a strategy work? I guess fypp_deployment.py should be run first, followed by fpm?
My script could be used as fpm build --compiler fypp_gfortran.py and would use the content of the existing fpm.toml to preprocess the files accordingly. However, fypp_deployment.py has a different aim, right (e.g., to generate the files in the stdlib-fpm branch)?
Your question is totally valid:
> Would this replace the call to fypp inside CMake?
It could. I'm starting to test this approach with a smaller project, and it works: I can use the same script to preprocess the fypp files before compiling with fpm or building with CMake. It is up to us to decide whether that would be a good solution. I'm starting to think it would, but I would like to hear your opinion.
What I honestly find extremely satisfying is that on Windows, with Visual Studio, this PRE_BUILD command is registered, so when I click on rebuild the project it launches the script. So I can focus on editing the fypp files and have everything running smoothly. And with VSCode I'll just use the CLI with fpm, in two commands.
> Since the script fypp_deployment.py generates the fpm.toml file, how would such a strategy work? I guess fypp_deployment.py should be run first, followed by fpm?
Yes, right now I use it in two steps: first do all the preprocessing, then build. My next target was to include what you have done with your script so that it also covers the building step (combining both ideas). But before going there, I wanted to get some feedback.
One of the limitations that has been discussed with the fpm branch and its fpm.toml is that the manifest is static: there is no way of conditionally linking against OpenBLAS, MKL or other libraries. Other conditional options might be missing as well, for instance maxrank. Regenerating the manifest file with the script could give the same flexibility as with CMake.
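To make the conditional-linking idea concrete, here is a minimal sketch of how the [build] table could be chosen from a backend name before the manifest is written. The build_section function, the backend keys and the library names for OpenBLAS and MKL are assumptions for illustration; only the lapack/blas pair comes from the script above.

```python
# Library names per backend are assumptions for illustration; adjust them to
# the names the target system actually provides.
LINK_LIBRARIES = {
    "none": [],
    "reference": ["lapack", "blas"],  # the pair used by the script above
    "openblas": ["openblas"],
    "mkl": ["mkl_rt"],
}


def build_section(backend="none"):
    """Return the content of the [build] table as a dict (empty if unused)."""
    try:
        libraries = LINK_LIBRARIES[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return {"link": list(libraries)} if libraries else {}


print(build_section("reference"))
print(build_section())
```

In pre_process_toml, the returned dict could be added to the manifest only when it is non-empty, in the same place where the with_blp branch currently adds the build table.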
I've made a single script to show the different steps, but as you can see there are already two steps that could be decoupled, pre_process_toml(args) and pre_process_fypp(args). A third step, process or build, could be included to complete the process.
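One way the decoupling could look is a small dispatcher that runs only the requested steps, always in a fixed order. This is a sketch only: run_steps, the step names and the lambda stand-ins are hypothetical and would be replaced by the real functions.

```python
def run_steps(selected, steps):
    """Run the selected steps in the order defined by `steps`.

    `steps` maps a step name to a zero-argument callable; names in
    `selected` that are not known raise a ValueError.
    """
    unknown = set(selected) - set(steps)
    if unknown:
        raise ValueError(f"unknown step(s): {sorted(unknown)}")
    executed = []
    for name, function in steps.items():  # dicts preserve insertion order
        if name in selected:
            function()
            executed.append(name)
    return executed


# Stand-ins for pre_process_toml, pre_process_fypp and a build step.
log = []
STEPS = {
    "toml": lambda: log.append("manifest written"),
    "fypp": lambda: log.append("sources preprocessed"),
    "build": lambda: log.append("build called"),
}

# Run only the fypp preprocessing, skipping manifest generation and build.
print(run_steps(["fypp"], STEPS))
```

Keeping the order in the steps mapping rather than in the caller's list means the manifest is always regenerated before a build, whatever order the names are given in.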
UPDATE
Here is an updated version that manages three steps, pre_process_toml, pre_process_fypp and fpm_build, which makes it possible to run something like the following from the root of the main branch of stdlib:
python fypp_deployment.py --with_qp 1 --flag "-O3 -march=native"
It will:
1. regenerate the fpm.toml,
2. preprocess all the fypp files with quad precision enabled, creating a .f90 or .F90 file right next to each original,
3. build stdlib using the user flags.
For CMake, only pre_process_fypp is needed if we replace the fypp preprocessing therein. I'm already thinking about splitting this into three modules.
@jvdp1 I tried to adapt what you did for building. I wanted to make it compiler-independent, while still allowing the compiler to be predefined. I'm still a bit at a loss with all the possibilities of the fpm CLI, and I don't yet see how I could make the script callable from it while ensuring the three steps, instead of doing what I did here, which is to call fpm build from within the script with subprocess.
import os
import fypp
import argparse
from joblib import Parallel, delayed

def pre_process_toml(args):
    """
    Pre-process the fpm.toml
    """
    from tomlkit import table, dumps
    data = table()
    data.add("name", "stdlib")
    data.add("version", str(args.vmajor)+
                    "."+str(args.vminor)+
                    "."+str(args.vpatch) )
    data.add("license", "MIT")
    data.add("author", "stdlib contributors")
    data.add("maintainer", "@fortran-lang/stdlib")
    data.add("copyright", "2019-2021 stdlib contributors")

    if(args.with_blp):
        build = table()
        build.add("link", ["lapack", "blas"] )
        data.add("build", build)

    dev_dependencies = table()
    dev_dependencies.add("test-drive", {"git" : "https://github.com/fortran-lang/test-drive",
                                        "tag" : "v0.4.0"})
    data.add("dev-dependencies", dev_dependencies)

    preprocess = table()
    preprocess.add("cpp", {} )
    preprocess['cpp'].add("suffixes", [".F90", ".f90", ".fypp"] )
    preprocess['cpp'].add("macros", ["MAXRANK="+str(args.maxrank),
                                     "PROJECT_VERSION_MAJOR="+str(args.vmajor),
                                     "PROJECT_VERSION_MINOR="+str(args.vminor),
                                     "PROJECT_VERSION_PATCH="+str(args.vpatch)] )
    data.add("preprocess", preprocess)

    with open("fpm.toml", "w") as f:
        f.write(dumps(data))
    return

C_PREPROCESSED = (
    "stdlib_linalg_constants" ,
    "stdlib_linalg_blas" ,
    "stdlib_linalg_blas_aux",
    "stdlib_linalg_blas_s",
    "stdlib_linalg_blas_d",
    "stdlib_linalg_blas_q",
    "stdlib_linalg_blas_c",
    "stdlib_linalg_blas_z",
    "stdlib_linalg_blas_w",
    "stdlib_linalg_lapack",
    "stdlib_linalg_lapack_aux",
    "stdlib_linalg_lapack_s",
    "stdlib_linalg_lapack_d",
    "stdlib_linalg_lapack_q",
    "stdlib_linalg_lapack_c",
    "stdlib_linalg_lapack_z",
    "stdlib_linalg_lapack_w"
)

def pre_process_fypp(args):
    kwd = []
    kwd.append("-DMAXRANK="+str(args.maxrank))
    kwd.append("-DPROJECT_VERSION_MAJOR="+str(args.vmajor))
    kwd.append("-DPROJECT_VERSION_MINOR="+str(args.vminor))
    kwd.append("-DPROJECT_VERSION_PATCH="+str(args.vpatch))
    if args.with_qp:
        kwd.append("-DWITH_QP=True")
    if args.with_xqp:
        kwd.append("-DWITH_XQP=True")
    print(kwd)

    optparser = fypp.get_option_parser()
    options, leftover = optparser.parse_args(args=kwd)
    options.includes = ['include']
    # options.line_numbering = True
    tool = fypp.Fypp(options)

    # Define the folders to search for *.fypp files
    folders = ['src', 'test']
    # Process all folders
    fypp_files = [os.path.join(root, file) for folder in folders
                  for root, _, files in os.walk(folder)
                  for file in files if file.endswith(".fypp")]

    def process_f(file):
        source_file = file
        root = os.path.dirname(file)
        basename = os.path.splitext(os.path.basename(source_file))[0]
        sfx = 'f90' if basename not in C_PREPROCESSED else 'F90'
        target_file = root + os.sep + basename + '.' + sfx
        tool.process_file(source_file, target_file)

    Parallel(n_jobs=args.njob)(delayed(process_f)(f) for f in fypp_files)
    return

def fpm_build(unknown):
    import subprocess
    #==========================================
    # check compilers
    if "FPM_FC" in os.environ:
        FPM_FC = os.environ['FPM_FC']
    if "FPM_CC" in os.environ:
        FPM_CC = os.environ['FPM_CC']
    if "FPM_CXX" in os.environ:
        FPM_CXX = os.environ['FPM_CXX']
    #==========================================
    # Filter out the macro definitions.
    macros = [arg for arg in unknown if arg.startswith("-D")]
    # Filter out the include paths with -I prefix.
    include_paths = [arg for arg in unknown if arg.startswith("-I")]
    # Filter out flags
    flags = " "
    for idx, arg in enumerate(unknown):
        if arg.startswith("--flag"):
            flags= unknown[idx+1]
    #==========================================
    # build with fpm; the arguments are passed as a list (no shell) so that
    # the whole flag string reaches fpm as the single value of --flag
    subprocess.run(["fpm", "build", "--flag", flags], check=True)
    return

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Preprocess stdlib source files.')
    # fypp arguments
    parser.add_argument("--vmajor", type=int, default=0, help="Project Version Major")
    parser.add_argument("--vminor", type=int, default=4, help="Project Version Minor")
    parser.add_argument("--vpatch", type=int, default=0, help="Project Version Patch")
    parser.add_argument("--njob", type=int, default=4, help="Number of parallel jobs for preprocessing")
    parser.add_argument("--maxrank",type=int, default=7, help="Set the maximum allowed rank for arrays")
    parser.add_argument("--with_qp",type=bool, default=False, help="Include WITH_QP in the command")
    parser.add_argument("--with_xqp",type=bool, default=False, help="Include WITH_XQP in the command")
    # external libraries arguments
    parser.add_argument("--with_blp",type=bool, default=False, help="Link against OpenBLAS")

    args, unknown = parser.parse_known_args()
    #==========================================
    # pre process the fpm manifest
    pre_process_toml(args)
    #==========================================
    # pre process the meta programming fypp files
    pre_process_fypp(args)
    #==========================================
    # build using fpm
    fpm_build(unknown)
Update: I'm tracking this in the following branch: https://github.com/jalvesz/stdlib/tree/deployment
The first target I'm trying to achieve is to replace the fpm-deployment.sh script with the Python script, which generates the stdlib-fpm folder and does the fypp preprocessing on the fly. The script is in the ci folder, but it might have a better place in config?
Running python ci/fypp_deployment.py creates the stdlib-fpm folder and preprocesses on the fly.
Running python ci/fypp_deployment.py --destdir "." performs an "in-place" preprocessing, so the fypp files are preprocessed at their location in the root folder.
Running python ci/fypp_deployment.py --build 1 preprocesses and builds using fpm, with the compiler taken from an environment variable using FPM_FC = os.environ['FPM_FC'] if "FPM_FC" in os.environ else "gfortran"
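The build step described above can be sketched as a function that assembles the fpm command as an argument list, so that no shell quoting is involved. The function names are hypothetical; the gfortran default mirrors the FPM_FC expression quoted above, and --compiler and --flag are the fpm options already used in this thread.

```python
import os
import subprocess


def fpm_build_command(flags="", environ=None):
    """Return the argument list for `fpm build` (no shell involved)."""
    environ = os.environ if environ is None else environ
    # Same fallback as in the update above: gfortran unless FPM_FC is set.
    compiler = environ.get("FPM_FC", "gfortran")
    command = ["fpm", "build", "--compiler", compiler]
    if flags.strip():
        # The complete flag string is one argument, the value of --flag.
        command += ["--flag", flags]
    return command


def fpm_build(flags=""):
    # check=True raises CalledProcessError if fpm exits with a non-zero status.
    subprocess.run(fpm_build_command(flags), check=True)


print(fpm_build_command("-O3 -march=native", {"FPM_FC": "gfortran"}))
```

Separating the command construction from the call makes the behaviour easy to check without having fpm installed.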
> The first target I'm trying to achieve is to replace the fpm-deployment.sh script by the use of the python script to generate the stdlib-fpm folder and do the fypp preprocessing on the fly. The script is in the ci folder but it might have a better place in config?
If its use is wider than just CI/CD, then I agree it should be moved to the config directory.
> This python ci/fypp_deployment.py creates the stdlib-fpm folder and preprocesses on the fly. This python ci/fypp_deployment.py --destdir "." would consider an "in-place" preprocessing so the fypp files are preprocessed at their location in the root folder.
What is the advantage of preprocessing the files in place? The generated .f90 files won't be tracked by git, and a git clean -f won't work.
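If in-place preprocessing is kept, the script itself could remove what it generated. The sketch below treats any .f90 or .F90 file that sits next to a .fypp file with the same stem as generated; that rule, and the names generated_files and clean_generated, are assumptions for illustration, and the rule would delete a hand-written file that happens to share a stem with a .fypp file.

```python
import os


def generated_files(folders=("src", "test")):
    """Yield .f90/.F90 files located next to a .fypp file with the same stem."""
    for folder in folders:
        for root, _, files in os.walk(folder):
            names = set(files)
            for name in files:
                stem, ext = os.path.splitext(name)
                if ext != ".fypp":
                    continue
                for suffix in (".f90", ".F90"):
                    if stem + suffix in names:
                        yield os.path.join(root, stem + suffix)


def clean_generated(folders=("src", "test")):
    """Delete the generated files and return the list of removed paths."""
    removed = list(generated_files(folders))
    for path in removed:
        os.remove(path)
    return removed
```

Calling clean_generated() from the root would then undo an in-place run without touching sources that have no .fypp counterpart.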
> If it is a wider use than just for the CI/CD, then I agree it should be moved to the config directory.
Perfect! I'll move it there then.
> What is the advantage to preprocess the files in-place?
What I would like to achieve is being able to work on the .fypp files and launch the build process (which should trigger the preprocessing automatically) in a "one-click" style, instead of having to move between folders. Basically, to run fpm build at the root folder. I could instead create the folders src\temp and test\temp on the fly and dump the .f90 files there, with this temp subfolder ignored by git.
I've used this to address my issue in #796. It was really useful to have a Python script, as the computer I'm working on is fairly restricted (if I install fypp via pip, I can't run it from the command line, but I can import it into Python). The 'standard' fpm-deployment.sh would hence not work for me.
My only real issue was that I had to edit the script to point directly at my fpm, as it's not on my PATH (for similar reasons, I use a pre-compiled binary).
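A possible way to avoid editing the script in that situation is to let the fpm location be supplied explicitly and to fall back to a PATH lookup. This is a sketch only: resolve_fpm and the FPM_EXE environment variable are hypothetical names, not something fpm or the script defines.

```python
import os
import shutil


def resolve_fpm(explicit=None, environ=None):
    """Return the fpm executable to call.

    Order of precedence: the explicit argument, then the FPM_EXE
    environment variable (a hypothetical name), then a lookup on PATH.
    """
    environ = os.environ if environ is None else environ
    candidate = explicit or environ.get("FPM_EXE")
    if candidate:
        return candidate
    found = shutil.which("fpm")
    if found is None:
        raise FileNotFoundError("fpm not found on PATH; pass its location explicitly")
    return found
```

The returned path can then be used as the first element of the argument list passed to subprocess.run.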
I was then able to use it in my fpm project by adding, for example,
[dependencies]
stdlib = {path = 'temp/stdlib/stdlib-fpm/'}
to my project's fpm.toml.
The script also doesn't handle the Cray compilers or environment, but that's not really unexpected, and fpm doesn't handle them anyway, so I switched to the GCC ones.
Thanks!