Many thanks for a very informative response; I know little about system programming. I don't dare to make something very generic and secure. My initial thought is that this wrapper can ease the task of converting a tex file to something else using pdftex or a tex-related program (I wanted to use the name 'runtex' for the wrapper, but googling it found too many results). A developer who is not a tex guru can deploy this task via a well-defined API, and tex-specific issues can be (partly) hidden from him. I assume a few things:

- the application using runptex should not have the setuid/setgid flag, and should not be run by root or a privileged user.
- it's the responsibility of the application to ensure that the working dir is safe; for example, the working dir must not be world-writable like /tmp, but rather a directory that belongs to the process owner, like $HOME/tmp or something similar.
- there are some restrictions on paths and filenames: a filename (without path) can contain only letters, digits and [-_.]; a path can contain the same plus the path separator and space.
- the main target platform is linux; support for other platforms is not mandatory if it's too difficult to do.

Do you find those assumptions reasonable, or are they too much?
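To make the path and filename restriction concrete, here is a minimal sketch of how such a check might look. The function names runptex_valid_filename and runptex_valid_path are hypothetical, '/' is assumed as the path separator (linux), and rejecting empty strings is my own assumption, not something decided above.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: letters, digits and the characters - _ .
 * isalnum() is locale-dependent; in the default "C" locale it
 * accepts ASCII letters and digits only. */
static bool is_name_char(unsigned char c)
{
    return isalnum(c) || c == '-' || c == '_' || c == '.';
}

/* Filename without path: non-empty (assumption), only [A-Za-z0-9-_.]. */
bool runptex_valid_filename(const char *name)
{
    if (name == NULL || *name == '\0')
        return false;
    for (const char *p = name; *p != '\0'; p++)
        if (!is_name_char((unsigned char)*p))
            return false;
    return true;
}

/* Path: the same characters plus the path separator '/' and space. */
bool runptex_valid_path(const char *path)
{
    if (path == NULL || *path == '\0')
        return false;
    for (const char *p = path; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (!is_name_char(c) && c != '/' && c != ' ')
            return false;
    }
    return true;
}
```

Note that names such as `..' still pass this check, so it only restricts the character set; it does not by itself keep a path inside the working dir.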
I assume your 'run_pdftex' interface is synchronous. IMO it would at least be required to have an asynchronous version as well. I.e., a version where you initiate the start and then later independently query and if necessary wait for the result. The reason is obvious: the program can do work on its own while TeX is running. Parallelism is extremely important going forward.
point taken
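A rough sketch of what the asynchronous variant could look like on linux, using fork(), execvp() and waitpid(). The names runptex_job, runptex_start, runptex_poll and runptex_wait are hypothetical, and the job struct is shown openly only to keep the sketch short.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for one running TeX job. */
typedef struct {
    pid_t pid;      /* child process id */
    int   status;   /* exit status once finished, -1 otherwise */
    bool  finished;
} runptex_job;

/* Start the program and return immediately.
 * argv must be NULL-terminated. Returns 0 on success, -1 if fork failed. */
int runptex_start(runptex_job *job, char *const argv[])
{
    job->finished = false;
    job->status = -1;
    job->pid = fork();
    if (job->pid < 0)
        return -1;
    if (job->pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }
    return 0;
}

/* Shared by poll and wait: 1 = finished, 0 = still running, -1 = error. */
static int collect(runptex_job *job, int options)
{
    int st;
    pid_t r;

    if (job->finished)
        return 1;
    if (job->pid <= 0)
        return -1;
    do {
        r = waitpid(job->pid, &st, options);
    } while (r < 0 && errno == EINTR);
    if (r < 0)
        return -1;
    if (r == 0)
        return 0;               /* WNOHANG and the child is still running */
    job->finished = true;
    job->status = WIFEXITED(st) ? WEXITSTATUS(st) : -1;
    return 1;
}

/* Query without blocking. */
int runptex_poll(runptex_job *job) { return collect(job, WNOHANG); }

/* Block until the job has finished. */
int runptex_wait(runptex_job *job) { return collect(job, 0); }
```

The application would call runptex_start(), do its own work, and call runptex_poll() now and then, or runptex_wait() when it has nothing else to do. A synchronous run is then just start followed by wait.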
And an implementation detail: _never_ expose data structures unless it is really, *REALLY* needed. I'm talking here about the pdftex_data_struct, of course. [...]
point taken
case (BTW: why not return an error code and not just success/failure information from the functions, then you don't have to pass a pointer to the tmp variable to pds_print_error).
I want to keep the error code and message inside the data structure, so that the application can decide what to do with them. I also find it handy for error handling in subroutines. For example, if I use a check_path() subroutine inside init_pdftex_data() to check whether a path is valid, I can set the error code and message inside check_path() itself, which is simpler than examining the return code of check_path() in init_pdftex_data(). And a simple return code is not informative enough; one often needs to report more detail than a return code can carry, something like:

  opening tex file `/home/thanh/tex/foo.tex' for reading failed: file not found.
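Roughly what I have in mind, as a sketch only: the struct layout, the error codes, the set_error() helper and the name runptex_init are all hypothetical, and check_path() here only tests the character set.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum { RUNPTEX_OK = 0, RUNPTEX_EBADPATH = 1 };  /* hypothetical codes */

typedef struct {
    int  err_code;
    char err_msg[256];
} runptex_data;

/* Record a code and a printf-style message in the structure. */
static void set_error(runptex_data *d, int code, const char *fmt, ...)
{
    va_list ap;

    d->err_code = code;
    va_start(ap, fmt);
    vsnprintf(d->err_msg, sizeof d->err_msg, fmt, ap);
    va_end(ap);
}

/* The subroutine reports its own failure in full detail. */
static bool check_path(runptex_data *d, const char *path)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789-_./ ";
    size_t n;

    if (path == NULL || *path == '\0') {
        set_error(d, RUNPTEX_EBADPATH, "empty path");
        return false;
    }
    n = strspn(path, allowed);      /* length of the valid prefix */
    if (path[n] != '\0') {
        set_error(d, RUNPTEX_EBADPATH,
                  "invalid character '%c' in path `%s'", path[n], path);
        return false;
    }
    return true;
}

/* The caller only propagates success or failure; the details are
 * already stored in d by whichever subroutine failed. */
bool runptex_init(runptex_data *d, const char *tex_path)
{
    d->err_code = RUNPTEX_OK;
    d->err_msg[0] = '\0';
    return check_path(d, tex_path);
}
```

The application can then look at err_code to decide what to do and show err_msg to the user in whatever way fits.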
Anyway, if you make this change the information about the struct is completely encapsulated in your code. This is important for maintainability since it gives you the opportunity to change the implementation as much as you want as long as the function interfaces remain the same.
agree
About pds_print_error_and_exit: such an interface is usually not useful except in tiny little programs. Assume you write a graphical shell for TeX. You don't want to terminate the program after a failed run, the user should be able to fix problems and rerun. What is needed, though, is the ability to show an error string. So, what maybe is needed is to have a function which returns an error string which can be printed in the appropriate way (on terminal, in dialog box, whatever).
yes, this function is of little use; it's intended for testing/debugging purposes. Applications should handle the error in their own way, using the error code and message stored in pdftex_data_struct.
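Combined with the encapsulation point above, this could end up as an opaque handle plus accessor functions that hand the error code and string back to the application. A minimal sketch, shown in one file for brevity; all names are hypothetical and the fixed 256-byte message buffer is an arbitrary choice.

```c
#include <stdlib.h>

/* --- what a public header (say runptex.h) would contain --- */
typedef struct runptex_data runptex_data;   /* incomplete type: layout hidden */

runptex_data *runptex_data_new(void);
void          runptex_data_free(runptex_data *d);
int           runptex_error_code(const runptex_data *d);
const char   *runptex_error_message(const runptex_data *d);

/* --- private to the implementation file --- */
struct runptex_data {
    int  err_code;
    char err_msg[256];
};

/* calloc() zeroes the struct: code 0 and an empty message. */
runptex_data *runptex_data_new(void)
{
    return calloc(1, sizeof(runptex_data));
}

void runptex_data_free(runptex_data *d)
{
    free(d);
}

int runptex_error_code(const runptex_data *d)
{
    return d->err_code;
}

/* The returned string is owned by d and stays valid until d is
 * modified or freed; the application prints it however it likes
 * (terminal, dialog box, log file). */
const char *runptex_error_message(const runptex_data *d)
{
    return d->err_msg;
}
```

Since applications only ever see the pointer, the fields can be changed later without breaking them, as long as the function interfaces stay the same.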
About the interface naming: C's flat namespace is crowded. To minimize the risk of conflicts you should standardize on a common prefix for all function and type names and stick with it. E.g.,
  pdftex_data_struct       -> pdftexlib_data_struct
  init_pdftex_data         -> pdftexlib_data_init
  pds_print_error_and_exit -> pdftexlib_error
  run_pdftex               -> pdftexlib_run
you get the idea.
agree. I will add the runptex_ prefix to all relevant names.

Thanh