How to compile Haskell into the untyped lambda calculus (or GHC core)?
Asked Answered
M

1

9

I'm looking for ways how to convert a simple Haskell program (no imported libraries, just data types and pure functions) into a term of the untyped lambda calculus. A promising approach seems to be to use GHC API to compile a program into GHC core, which can be then converted into the untyped lambda calculus.

How to use GHC API to load a Haskell program and compile it into Core?

Metalline answered 24/12, 2014 at 10:5 Comment(6)
ghc has an option to output core - ghc -ddump-simpl. Also take a look at (ghc-core)[hackage.haskell.org/package/ghc-core], which basically just wraps ddump-simpl with some postprocessing.Gstring
@Gstring That's true, but as I understand, ghc-core gives me a human-readable output that I'd further need to parse. Instead I'd like to get the core expression as a data structure to analyze it further. But it could be a good staring point, thanks.Metalline
That human-readable output also isn't designed to be parsed, and may not be terribly stable from version to version.Gendarmerie
There are a number of related attempts at the core version of this question and attempted answers on stack overflow. One answer appears to desugar a simple Haskell program into core with an older version of ghc. A newer question asks how to compile to core and then from core to binary.Mitch
Untyped lambda calculus has only functions, no data. While you can encode data as functions, ghc cote won't help you in that.Acanthocephalan
@n.m. That's perfectly fine, I just want to get as close as possible to the untyped lambda calculus (core) and take over from there.Metalline
M
7

From the GHC module documentation in the ghc docs:

compileToCoreModule :: GhcMonad m => FilePath -> m CoreModule

This is the way to get access to the Core bindings corresponding to a module. compileToCore parses, typechecks, and desugars the module, then returns the resulting Core module (consisting of the module name, type declarations, and function declarations) if successful.

compileToCoreSimplified :: GhcMonad m => FilePath -> m CoreModule

Like compileToCoreModule, but invokes the simplifier, so as to return simplified and tidied Core.

I found this by looking through the list of GHC modules, noticing the Desugar module, noticing ModGuts in the result of deSugar, downloading all of the documentation, and searching the text for ModGuts.

Minimal Example

Our example will compile a simple module so we can see what the core looks like. It uses ghc-paths to provide the location of the ghc libs directory. The core will be represented in memory by a CoreModule containing a list of CoreBinds. We can't dump the AST directly because there aren't Show instances for the AST described in CoreSyn, however the Outputable instance for CoreModule will pretty-print the core so we can see that we compiled to core.

import GHC
import DynFlags
import Outputable (Outputable, showPpr)
import qualified GHC.Paths as Paths

import Data.Functor

runGhc' takes care of all the setup needed for compiling to core a module with no imports and no TemplateHaskell. We completely turn off the linker with NoLink and tell the compiler to produce nothing with HscNothing.

runGhc' :: Ghc a -> IO a
runGhc' ga = do
    runGhc (Just Paths.libdir) $ do
        dflags <- getDynFlags
        let dflags2 = dflags { ghcLink   = NoLink
                             , hscTarget = HscNothing
                             }
        setSessionDynFlags dflags2
        ga

Compiling a module to core consists of setting the target with guessTarget and addTarget, optionally loading dependencies with load, building the module graph with depanel, finding the correct ModSummary in the module graph, parsing the module with parseModule, type checking it with typecheckModule, desugarring it with desugarModule, converting it to ModGuts with coreModule from the DesugaredMod instance for the result of desugarring, and extracting the core from the ModGuts. All of this is wrapped up in a nice package by compileToCoreModule.

compileExample :: Ghc CoreModule
compileExample = compileToCoreModule "prettyPrint2dList.hs"

Our whole example program will output the core with showPpr.

showPpr' :: (Functor m, Outputable a, HasDynFlags m) => a -> m String
showPpr' a = (flip showPpr) a <$> getDynFlags

main = runGhc' (compileExample >>= showPpr') >>= putStrLn

Compiling the above example requires the -package ghc flag to expose the normally hidden ghc api package.

The example module we'll compile to core, "prettyPrint2dList.hs", contains a data type and a bit of code that uses functions from the prelude.

data X = Y | Z
         deriving (Eq, Show)

type R = [X]
type W = [R]

example = map (\x -> take x (cycle [Y, Z])) [0..]

main = undefined

Which produces a slew of pretty-printed core.

%module main:Main (Safe-Inferred) [01D :-> Identifier `:Main.main',
                                   a1f2 :-> Identifier `$c==', a1f5 :-> Identifier `$c/=',
                                   a1fb :-> Identifier `$cshowsPrec', a1fh :-> Identifier `$cshow',
                                   a1fk :-> Identifier `$cshowList',
                                   r0 :-> Identifier `Main.$fShowX', r1 :-> Identifier `Main.$fEqX',
                                   r2 :-> Type constructor `Main.R',
                                   r3 :-> Type constructor `Main.X', r4 :-> Identifier `Main.main',
                                   rqS :-> Type constructor `Main.W',
                                   rrS :-> Data constructor `Main.Y', rrV :-> Identifier `Main.Y',
                                   rrW :-> Data constructor `Main.Z', rrX :-> Identifier `Main.Z',
                                   rL2 :-> Identifier `Main.example']
Main.example :: [[Main.X]]
[LclIdX, Str=DmdType]
Main.example =
  GHC.Base.map
    @ GHC.Types.Int
    @ [Main.X]
    (\ (x :: GHC.Types.Int) ->
       GHC.List.take
         @ Main.X
         x
         (GHC.List.cycle
            @ Main.X
            (GHC.Types.:
               @ Main.X
               Main.Y
               (GHC.Types.: @ Main.X Main.Z (GHC.Types.[] @ Main.X)))))
    (GHC.Enum.enumFrom
       @ GHC.Types.Int GHC.Enum.$fEnumInt (GHC.Types.I# 0))
Main.main :: forall t. t
[LclIdX, Str=DmdType]
Main.main = GHC.Err.undefined
$cshowsPrec :: GHC.Types.Int -> Main.X -> GHC.Show.ShowS
[LclId, Str=DmdType]
$cshowsPrec =
  \ _ [Occ=Dead] (ds_d1gG :: Main.X) ->
    case ds_d1gG of _ [Occ=Dead] {
      Main.Y ->
        GHC.Show.showString
          (GHC.Types.:
             @ GHC.Types.Char
             (GHC.Types.C# 'Y')
             (GHC.Types.[] @ GHC.Types.Char));
      Main.Z ->
        GHC.Show.showString
          (GHC.Types.:
             @ GHC.Types.Char
             (GHC.Types.C# 'Z')
             (GHC.Types.[] @ GHC.Types.Char))
    }
Main.$fShowX [InlPrag=[ALWAYS] CONLIKE] :: GHC.Show.Show Main.X
[LclIdX[DFunId],
 Str=DmdType,
 Unf=DFun: \ ->
       GHC.Show.D:Show TYPE Main.X $cshowsPrec $cshow $cshowList]
Main.$fShowX =
  GHC.Show.D:Show @ Main.X $cshowsPrec $cshow $cshowList;
$cshowList [Occ=LoopBreaker] :: [Main.X] -> GHC.Show.ShowS
[LclId, Str=DmdType]
$cshowList =
  GHC.Show.showList__
    @ Main.X
    (GHC.Show.showsPrec @ Main.X Main.$fShowX (GHC.Types.I# 0));
$cshow [Occ=LoopBreaker] :: Main.X -> GHC.Base.String
[LclId, Str=DmdType]
$cshow = GHC.Show.$dmshow @ Main.X Main.$fShowX;
$c== :: Main.X -> Main.X -> GHC.Types.Bool
[LclId, Str=DmdType]
$c== =
  \ (ds_d1gB :: Main.X) (ds_d1gC :: Main.X) ->
    let {
      fail_d1gD :: GHC.Prim.Void# -> GHC.Types.Bool
      [LclId, Str=DmdType]
      fail_d1gD = \ _ [Occ=Dead, OS=OneShot] -> GHC.Types.False } in
    case ds_d1gB of _ [Occ=Dead] {
      Main.Y ->
        case ds_d1gC of _ [Occ=Dead] {
          __DEFAULT -> fail_d1gD GHC.Prim.void#;
          Main.Y -> GHC.Types.True
        };
      Main.Z ->
        case ds_d1gC of _ [Occ=Dead] {
          __DEFAULT -> fail_d1gD GHC.Prim.void#;
          Main.Z -> GHC.Types.True
        }
    }
Main.$fEqX [InlPrag=[ALWAYS] CONLIKE] :: GHC.Classes.Eq Main.X
[LclIdX[DFunId],
 Str=DmdType,
 Unf=DFun: \ -> GHC.Classes.D:Eq TYPE Main.X $c== $c/=]
Main.$fEqX = GHC.Classes.D:Eq @ Main.X $c== $c/=;
$c/= [Occ=LoopBreaker] :: Main.X -> Main.X -> GHC.Types.Bool
[LclId, Str=DmdType]
$c/= =
  \ (a :: Main.X) (b :: Main.X) ->
    GHC.Classes.not (GHC.Classes.== @ Main.X Main.$fEqX a b);
:Main.main :: GHC.Types.IO GHC.Prim.Any
[LclIdX, Str=DmdType]
:Main.main =
  GHC.TopHandler.runMainIO
    @ GHC.Prim.Any (Main.main @ (GHC.Types.IO GHC.Prim.Any))
Mitch answered 3/2, 2015 at 22:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.