Using GHC API to compile Haskell sources to CORE and CORE to binary

The Idea

Hello! I want to create a program that generates Haskell Core and uses the GHC API to compile it further into an executable. Before doing that, I want to construct a very basic example showing how to compile Haskell sources into Core and then into a binary.

The problem

I have read a lot of documentation and tried many functions from the GHC API, so far without success. I started with the official GHC API introduction and successfully compiled its examples. They show the usage of the following functions: parseModule, typecheckModule, desugarModule, getNamesInScope and getModuleGraph, but they do not cover the final compilation step. On the other hand, the API has some functions whose names look related to the problem, like HscMain.{hscCompileOneShot, hscCompileBatch} or GHC.{compileToCoreModule, compileCoreToObj}. I tried to use them, but I get runtime errors, as in this example:

import GHC
import GHC.Paths ( libdir )
import DynFlags
targetFile = "Test.hs"

main :: IO ()
main = do
   res <- example
   return ()

example = 
    defaultErrorHandler defaultFatalMessager defaultFlushOut $ do
      runGhc (Just libdir) $ do
        dflags <- getSessionDynFlags
        let dflags' = foldl xopt_set dflags
                            [Opt_Cpp, Opt_ImplicitPrelude, Opt_MagicHash]
        setSessionDynFlags dflags'
        coreMod <- compileToCoreModule targetFile
        compileCoreToObj False coreMod "foo" "bar"
        return () 

This can be compiled with ghc -package ghc Main.hs and results in the following error at runtime:

Main: panic! (the 'impossible' happened)
  (GHC version 7.8.3 for x86_64-unknown-linux):
    expectJust mkStubPaths

This may of course be the result of wrong API usage, in particular of the line compileCoreToObj False coreMod "foo" "bar", where the strings are just random ones, because the documentation does not say much about them. Looking at the sources, it seems that the first one is the output name and the second one is "extCore_filename", whatever that is.

Another worrying thing is the comment in the documentation next to the compileCoreToObj function:

[...] This has only so far been tested with a single self-contained module.

But I hope it will not introduce any further problems.

The question

What is the best way to do this? How can we create a minimal working example that loads Haskell sources, compiles them into Core and then compiles the Core to a final executable (using the GHC API)? The intermediate step is needed so that it can later be replaced by custom Core.

As a side question: is it currently possible to provide GHC with external Core files, or is this feature not implemented yet, so that I will have to construct the Core manually using the GHC API? (Related to: Compiling to GHC Core.)

Update

I was finally able to create a small example that loads a module and compiles it to .hi and .o files. This is not a solution to the problem, because it does not allow me to replace the Core and it does not yet link the object files into an executable:

import GHC
import GHC.Paths ( libdir )
import DynFlags

targetFile :: FilePath
targetFile = "Test.hs"

main :: IO ()
main = example

example :: IO ()
example =
    defaultErrorHandler defaultFatalMessager defaultFlushOut $ do
      runGhc (Just libdir) $ do
        dflags <- getSessionDynFlags
        -- Generate native code and ask for a linked binary.
        let dflags2 = dflags { ghcLink   = LinkBinary
                             , hscTarget = HscAsm
                             }
        let dflags' = foldl xopt_set dflags2
                            [Opt_Cpp, Opt_ImplicitPrelude, Opt_MagicHash]
        _ <- setSessionDynFlags dflags'
        -- Register the source file as the only target and build it.
        setTargets =<< sequence [guessTarget targetFile Nothing]
        _ <- load LoadAllTargets
        return ()
Oliviero answered 21/1, 2015 at 4:34 Comment(3)
The devs of the ghc mailing list are probably the ones with the answers you are looking for. - Taxi
I believe there's been work done recently on allowing plugins to do core-to-core transformation passes within the GHC pipeline; not exactly what you want though. Perhaps you could explain briefly why you want to do this precise workflow? - Banker
@ChristianConkle I do not want CORE -> CORE plugin. I'm creating my custom language and want to compile it to core and then use GHC pipeline. Sorry for being unclear. - Oliviero

Generating a textual representation of Core is not a problem here, because it can be done in multiple ways. You can use the -fext-core flag to generate .hcr files and work with them using e.g. extcore. There are also other packages that can dump Core, like ghc-core or ghc-core-html.

The main problem here is loading GHC Core into GHC. As far as I know, this used to be supported but no longer is: there was little interest in using it and it became obsolete over time.

The best thing we can try here is to dig deeper into the GHC internals, find the places where Core is used and try to modify it there. Maybe we can also try creating a GHC plugin and modifying the Core with it.

Seychelles answered 30/1, 2015 at 13:30 Comment(1)
I'm not interested in creating textual Core, because I want to generate its AST. Loading external Core into GHC could be a solution, but as you have written, it is not maintained anymore; I got the same answer on the mailing list. I'm still very interested in getting the "real" solution, but for now I will just drop the bounty here, because the time is over :( - Oliviero

Short answer: once you have the object file, you use the C compiler of your choice to compile a main stub and link everything into an executable.
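As a possible shortcut for that link step, the ghc driver itself can be given only object files, in which case it generates the main stub and the long linker command on its own. The following is a minimal sketch, not a tested recipe for this exact setup: it assumes that ghc is on the PATH, that Test.o was produced from a Main module, and that only default packages are needed. The executable name "test" and the helper names linkArgs and linkWithGhc are hypothetical.

```haskell
module Main where

import System.Process (callProcess)

-- | Arguments for a link-only invocation of the ghc driver:
-- an output name followed by the object files to link.
linkArgs :: FilePath -> [FilePath] -> [String]
linkArgs exe objs = ["-o", exe] ++ objs

-- | Run the link step. Assumes `ghc` is on the PATH and that one of
-- the object files defines the Main module.
linkWithGhc :: FilePath -> [FilePath] -> IO ()
linkWithGhc exe objs = callProcess "ghc" (linkArgs exe objs)

-- Only prints the command line, so running this has no side effects.
-- Call linkWithGhc "test" ["Test.o"] to actually perform the link.
main :: IO ()
main = putStrLn (unwords ("ghc" : linkArgs "test" ["Test.o"]))
```

If the objects depend on packages beyond the defaults, the corresponding -package flags would have to be added to the argument list.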

If you have the object file, then the last steps that GHC would do are done by a C compiler and the linker. For instance, with verbose output (-v) and -keep-tmp-files enabled for a simple hello_world, the last three steps for me, after building the objects, were:

'/usr/bin/gcc' '-fno-stack-protector' '-Wl,--hash-size=31' '-Wl,--reduce-memory-overheads' '-c' '/tmp/ghc29076_0/ghc29076_0.c' '-o' '/tmp/ghc29076_0/ghc29076_0.o' '-DTABLES_NEXT_TO_CODE' '-I/usr/lib/ghc/include'
*** C Compiler:
'/usr/bin/gcc' '-fno-stack-protector' '-Wl,--hash-size=31' '-Wl,--reduce-memory-overheads' '-c' '/tmp/ghc29076_0/ghc29076_0.s' '-o' '/tmp/ghc29076_0/ghc29076_1.o' '-DTABLES_NEXT_TO_CODE' '-I/usr/lib/ghc/include'
*** Linker:
'/usr/bin/gcc' '-fno-stack-protector' '-Wl,--hash-size=31' '-Wl,--reduce-memory-overheads' '-o' 'hello' 'hello.o' '-L/usr/lib/ghc/base-4.6.0.1' '-L/usr/lib/ghc/integer-gmp-0.5.0.0' '-L/usr/lib/ghc/ghc-prim-0.3.0.0' '-L/usr/lib/ghc' '/tmp/ghc29076_0/ghc29076_0.o' '/tmp/ghc29076_0/ghc29076_1.o' '-lHSbase-4.6.0.1' '-lHSinteger-gmp-0.5.0.0' '-lgmp' '-lHSghc-prim-0.3.0.0' '-lHSrts' '-lffi' '-lm' '-lrt' '-ldl' '-u' 'ghczmprim_GHCziTypes_Izh_static_info' '-u' 'ghczmprim_GHCziTypes_Czh_static_info' '-u' 'ghczmprim_GHCziTypes_Fzh_static_info' '-u' 'ghczmprim_GHCziTypes_Dzh_static_info' '-u' 'base_GHCziPtr_Ptr_static_info' '-u' 'ghczmprim_GHCziTypes_Wzh_static_info' '-u' 'base_GHCziInt_I8zh_static_info' '-u' 'base_GHCziInt_I16zh_static_info' '-u' 'base_GHCziInt_I32zh_static_info' '-u' 'base_GHCziInt_I64zh_static_info' '-u' 'base_GHCziWord_W8zh_static_info' '-u' 'base_GHCziWord_W16zh_static_info' '-u' 'base_GHCziWord_W32zh_static_info' '-u' 'base_GHCziWord_W64zh_static_info' '-u' 'base_GHCziStable_StablePtr_static_info' '-u' 'ghczmprim_GHCziTypes_Izh_con_info' '-u' 'ghczmprim_GHCziTypes_Czh_con_info' '-u' 'ghczmprim_GHCziTypes_Fzh_con_info' '-u' 'ghczmprim_GHCziTypes_Dzh_con_info' '-u' 'base_GHCziPtr_Ptr_con_info' '-u' 'base_GHCziPtr_FunPtr_con_info' '-u' 'base_GHCziStable_StablePtr_con_info' '-u' 'ghczmprim_GHCziTypes_False_closure' '-u' 'ghczmprim_GHCziTypes_True_closure' '-u' 'base_GHCziPack_unpackCString_closure' '-u' 'base_GHCziIOziException_stackOverflow_closure' '-u' 'base_GHCziIOziException_heapOverflow_closure' '-u' 'base_ControlziExceptionziBase_nonTermination_closure' '-u' 'base_GHCziIOziException_blockedIndefinitelyOnMVar_closure' '-u' 'base_GHCziIOziException_blockedIndefinitelyOnSTM_closure' '-u' 'base_ControlziExceptionziBase_nestedAtomically_closure' '-u' 'base_GHCziWeak_runFinalizzerBatch_closure' '-u' 'base_GHCziTopHandler_flushStdHandles_closure' '-u' 'base_GHCziTopHandler_runIO_closure' '-u' 'base_GHCziTopHandler_runNonIO_closure' '-u' 
'base_GHCziConcziIO_ensureIOManagerIsRunning_closure' '-u' 'base_GHCziConcziSync_runSparks_closure' '-u' 'base_GHCziConcziSignal_runHandlers_closure'

Looking at those first two files reveals that the C file is just:

#include "Rts.h"
extern StgClosure ZCMain_main_closure;
int main(int argc, char *argv[])
{
    RtsConfig __conf = defaultRtsConfig;
    __conf.rts_opts_enabled = RtsOptsSafeOnly;
    return hs_main(argc, argv, &ZCMain_main_closure, __conf);
}

That doesn't seem like it should change much from project to project.

The assembly file is:

 .section .debug-ghc-link-info,"",@note
 .ascii "([\"-lHSbase-4.6.0.1\",\"-lHSinteger-gmp-0.5.0.0\",\"-lgmp\",\"-lHSghc-prim-0.3.0.0\",\"-lHSrts\",\"-lffi\",\"-lm\",\"-lrt\",\"-ldl\",\"-u\",\"ghczmprim_GHCziTypes_Izh_static_info\",\"-u\",\"ghczmprim_GHCziTypes_Czh_static_info\",\"-u\",\"ghczmprim_GHCziTypes_Fzh_static_info\",\"-u\",\"ghczmprim_GHCziTypes_Dzh_static_info\",\"-u\",\"base_GHCziPtr_Ptr_static_info\",\"-u\",\"ghczmprim_GHCziTypes_Wzh_static_info\",\"-u\",\"base_GHCziInt_I8zh_static_info\",\"-u\",\"base_GHCziInt_I16zh_static_info\",\"-u\",\"base_GHCziInt_I32zh_static_info\",\"-u\",\"base_GHCziInt_I64zh_static_info\",\"-u\",\"base_GHCziWord_W8zh_static_info\",\"-u\",\"base_GHCziWord_W16zh_static_info\",\"-u\",\"base_GHCziWord_W32zh_static_info\",\"-u\",\"base_GHCziWord_W64zh_static_info\",\"-u\",\"base_GHCziStable_StablePtr_static_info\",\"-u\",\"ghczmprim_GHCziTypes_Izh_con_info\",\"-u\",\"ghczmprim_GHCziTypes_Czh_con_info\",\"-u\",\"ghczmprim_GHCziTypes_Fzh_con_info\",\"-u\",\"ghczmprim_GHCziTypes_Dzh_con_info\",\"-u\",\"base_GHCziPtr_Ptr_con_info\",\"-u\",\"base_GHCziPtr_FunPtr_con_info\",\"-u\",\"base_GHCziStable_StablePtr_con_info\",\"-u\",\"ghczmprim_GHCziTypes_False_closure\",\"-u\",\"ghczmprim_GHCziTypes_True_closure\",\"-u\",\"base_GHCziPack_unpackCString_closure\",\"-u\",\"base_GHCziIOziException_stackOverflow_closure\",\"-u\",\"base_GHCziIOziException_heapOverflow_closure\",\"-u\",\"base_ControlziExceptionziBase_nonTermination_closure\",\"-u\",\"base_GHCziIOziException_blockedIndefinitelyOnMVar_closure\",\"-u\",\"base_GHCziIOziException_blockedIndefinitelyOnSTM_closure\",\"-u\",\"base_ControlziExceptionziBase_nestedAtomically_closure\",\"-u\",\"base_GHCziWeak_runFinalizzerBatch_closure\",\"-u\",\"base_GHCziTopHandler_flushStdHandles_closure\",\"-u\",\"base_GHCziTopHandler_runIO_closure\",\"-u\",\"base_GHCziTopHandler_runNonIO_closure\",\"-u\",\"base_GHCziConcziIO_ensureIOManagerIsRunning_closure\",\"-u\",\"base_GHCziConcziSync_runSparks_closure\",\"-u\",\"base_GHCziConcziSignal_runHandlers_closure\"],[],Nothing,RtsOptsSafeOnly,False,[],[])"

Well, that's a little worse, but it looks like a list of linker flags, with some gobbledygook, that GHC passes to the linker at the end. I'm not sure what all the things are that the linker is undefining (the -u flags), and looking at the linker flags will be your biggest homework. Will you have to modify those flags? Maybe, and maybe only if dependencies change.
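The long symbol names in those -u flags are less opaque than they look: they are ordinary Haskell names in GHC's Z-encoding, which escapes characters that are not valid in object-file symbols. The sketch below decodes only the escapes that occur in the flags and in the C stub shown above (zm, zi, zh, zz, ZC, ZZ); the full encoding has more cases, and the function name zDecode is hypothetical.

```haskell
module Main where

-- | Decode a GHC Z-encoded symbol name into its readable form.
-- Only a subset of the escapes is handled; every other character
-- is passed through unchanged.
zDecode :: String -> String
zDecode ('z':'m':rest) = '-' : zDecode rest   -- zm is '-'
zDecode ('z':'i':rest) = '.' : zDecode rest   -- zi is '.'
zDecode ('z':'h':rest) = '#' : zDecode rest   -- zh is '#'
zDecode ('z':'z':rest) = 'z' : zDecode rest   -- zz is a literal 'z'
zDecode ('Z':'C':rest) = ':' : zDecode rest   -- ZC is ':'
zDecode ('Z':'Z':rest) = 'Z' : zDecode rest   -- ZZ is a literal 'Z'
zDecode (c:rest)       = c : zDecode rest
zDecode []             = []

-- Print the decoded form of three symbols taken from the output above.
main :: IO ()
main = mapM_ (putStrLn . zDecode)
  [ "ghczmprim_GHCziTypes_Izh_static_info"
  , "base_GHCziWeak_runFinalizzerBatch_closure"
  , "ZCMain_main_closure"
  ]
```

Decoded this way, ghczmprim_GHCziTypes_Izh_static_info reads as ghc-prim_GHC.Types_I#_static_info, that is, package ghc-prim, module GHC.Types, constructor I#. The ZCMain_main_closure referenced by the C stub reads as :Main_main_closure.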

Durwin answered 28/1, 2015 at 20:52 Comment(2)
As for the other part of your question, replacing the Core with another, you're probably out of luck, and I'm not even quite sure what your aim is there. - Durwin
Hmm, OK, maybe linking the object file manually is the way to go, but I hope GHC has some routines to do this for us. The details might change at some point, and supporting this would be much easier if it were handled by GHC. @Durwin I want to compile my custom language to Core and then just proceed with the normal GHC pipeline. - Oliviero
