Learnings from Haskell in production
Sreenidhi Nair
sreenidhi@byteally.com
# Haskell at ByteAlly
* Haskell from the beginning, for the last 8 years.
* FMap - A transpiler that converts a Haskell EDSL into human-readable code for mobile platforms.
* FRF - A workflow library that makes business processes easy to define and visualize.
* TypeQL - A query language for the internet.
Today: we're going to focus on TypeQL
-----
# TypeQL
* Typed query language.
* Intended for stitching data from different sources and processing it.
  * SQL
  * Web APIs
  * CSV
  * GPU computations
* Currently used in internal projects.
-----
# Actions
## TypeQL
usrs = Read DB User {}
## Haskell
type Usrs = Read DB User '[]
* Translated to Haskell type-level code to:
1. Avoid writing a custom type checker.
2. Ensure everything is statically checked.
-----
# Combinators
## TypeQL
usrCourses = Product { usrs = Read DB User {}
, courses = Api Course {}
} { }
## Haskell
type UsrCourses =
Product '[ "usrs" := Read DB User '[]
, "courses" := Api Course '[]
] '[ ]
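
One way the type-level encoding above could be declared is sketched below. This is an assumption-laden illustration, not TypeQL's actual definitions: the empty data types, the `:=` operator, the `Inputs` family and the `InputNames` class are all hypothetical names chosen to mirror the slide. It shows that a query can live purely at the type level and still be inspected from ordinary code.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Prelude hiding (Read)
import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Hypothetical sources and entities; they carry no values, only names.
data DB
data User
data Course

-- Hypothetical query constructors, used only as types.
data Read (source :: Type) (entity :: Type) (filters :: [Type])
data Api (entity :: Type) (filters :: [Type])
data Product (inputs :: [Type]) (filters :: [Type])

-- A named input: a type-level string paired with a query.
data (name :: Symbol) := (query :: Type)

type UsrCourses =
  Product '[ "usrs"    := Read DB User '[]
           , "courses" := Api Course '[]
           ] '[]

-- Project the inputs out of a Product query.
type family Inputs (q :: Type) :: [Type] where
  Inputs (Product inputs filters) = inputs

-- Bring the input names down to the value level.
class InputNames (xs :: [Type]) where
  inputNames :: Proxy xs -> [String]

instance InputNames '[] where
  inputNames _ = []

instance (KnownSymbol name, InputNames xs) => InputNames ((name := query) ': xs) where
  inputNames _ = symbolVal (Proxy :: Proxy name) : inputNames (Proxy :: Proxy xs)
```

In GHCi, `inputNames (Proxy :: Proxy (Inputs UsrCourses))` lists the names of the two inputs.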
-----
# Filters
## TypeQL
usrsAbove20 = Read DB User { FilterBy age
:using-hs `>= 20`
}
## Haskell
type UsrsAbove20 = Read DB User '[ FilterBy '[S "age"]
`Using` >= 20
]
- `:using-{lang}` names the language of the filter expression. Currently Haskell is supported; JavaScript is close to being added.
-----
# Validations
Powered by reifying and analyzing queries
## Invalid targeting
usrs = Read DB User { FilterBy time }
## Filter type mismatch
usrs = Read DB User { Avg name }
## Input type mismatch
usrsAbove20 = Read DB User { FilterBy age
:using-hs `+ 1` }
-----
# Reifying query
- The generated type-level query is converted into a richer (type-level) AST called a tree.
- Validations are performed on the reified tree.
- The reification is done with (closed) type families.
type family F a where
F Int = Int
F a = TypeError ('Text "It isn't an Int")
- Custom type errors emit domain-specific error messages.
- The reification is also used for editor completions.
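
A slightly larger sketch of the same idea, assuming a hypothetical `AvgResult` family that plays the role of the "filter type mismatch" check for `Avg`. The names `AvgResult` and `avgOf` are illustrative and not part of TypeQL.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

import GHC.TypeLits (ErrorMessage (..), TypeError)

-- Closed type family: equations are tried top to bottom, so the last
-- one is the fallback that produces the domain-specific error.
type family AvgResult a where
  AvgResult Int    = Double
  AvgResult Double = Double
  AvgResult a      =
    TypeError ('Text "Avg needs a numeric column, but got " ':<>: 'ShowType a)

-- The equality constraint forces the family to be evaluated at each call site.
avgOf :: (Real a, AvgResult a ~ Double) => [a] -> Double
avgOf xs = realToFrac (sum xs) / fromIntegral (length xs)
```

`avgOf [1, 2, 3 :: Int]` type checks, while something like `avgOf ["alice", "bob"]` is rejected at compile time, and the report should carry the custom message rather than only a generic unification error.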
-----
# Extensibility
* Actions
- SQL, WebApi, CSV, GPU
* Combinators
- Joins, And, Or
* Filters
- FilterBy, GroupBy, Avg, Sum
-----
# Modelling ADTs
Avoid the boilerplate of writing separate record types that differ in only a few fields. With boilerplate:
newtype Age = Age { getAge :: Natural }
newtype Department = Dept { getDept :: Text }
data User = User { name :: Text
, age :: Age
, dept :: Department
}
data UserAvgAge = UserAvgAge { name :: Text
, avgAge :: Double
, dept :: Department
}
-----
# Modelling ADTs (2)
HList representation, no boilerplate:
data HList xs where
(:>) :: x -> HList xs -> HList (fld ::: x ': xs)
HNil :: HList '[]
type User = HList '[ "name" ::: Identity Text
, "age" ::: Identity Age
, "dept" ::: Identity Department
]
type UserAvgAge =
HList '[ "name" ::: Identity Text
, "age" ::: Identity Double
, "dept" ::: Identity Department
]
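
A compilable version of the HList idea, as a minimal sketch. It simplifies the slide: the `Identity` wrappers are dropped, `String` and `Int` stand in for `Text` and `Age`, and the helper names `ageToDouble` and `userAge` are hypothetical.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import GHC.TypeLits (Symbol)

-- A field label paired with the type of its value.
-- The label exists only at the type level.
data (fld :: Symbol) ::: (x :: Type)
infix 6 :::

data HList (xs :: [Type]) where
  HNil :: HList '[]
  (:>) :: x -> HList xs -> HList ((fld ::: x) ': xs)
infixr 5 :>

type User       = HList '[ "name" ::: String, "age" ::: Int,    "dept" ::: String ]
type UserAvgAge = HList '[ "name" ::: String, "age" ::: Double, "dept" ::: String ]

person1 :: User
person1 = "person1" :> 20 :> "IT" :> HNil

-- Change the type of one field; every other field is carried along untouched.
ageToDouble
  :: HList (("name" ::: String) ': ("age" ::: Int) ': rest)
  -> HList (("name" ::: String) ': ("age" ::: Double) ': rest)
ageToDouble (n :> a :> rest) = n :> fromIntegral a :> rest

-- Read the "age" field when it sits in second position.
userAge :: HList ((fld ::: x) ': ("age" ::: age) ': rest) -> age
userAge (_ :> a :> _) = a
```

`ageToDouble person1` has type `UserAvgAge` without a second record declaration, which is the boilerplate the slide is avoiding.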
-----
# Filter processing pipeline
Two types of filters
- Transformational filters
- Row filters
Both are implemented via standard Haskell abstractions.
-----
# Transformational filters
Transforms a particular field
usrs = Read DB User { Map "age" :using-hs (+ 1) }
name age department
[ "person1", 20 , IT
, "person2", 20 , IT
]
=>
[ "person1", 21 , IT
, "person2", 21 , IT
]
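
The effect of the `Map` filter above can be sketched on a plain record. This is only the value-level behavior, not how TypeQL implements it; `User`, `Dept`, `mapAge` and `sample` are hypothetical stand-ins.

```haskell
data Dept = IT | Sales deriving (Eq, Show)

data User = User { name :: String, age :: Int, dept :: Dept }
  deriving (Eq, Show)

-- Counterpart of: Map "age" :using-hs (+ 1)
-- Only the targeted field changes; the number of rows stays the same.
mapAge :: (Int -> Int) -> [User] -> [User]
mapAge f = map (\u -> u { age = f (age u) })

sample :: [User]
sample = [User "person1" 20 IT, User "person2" 20 IT]
```

`mapAge (+ 1) sample` yields the same two rows with `age` set to 21.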
-----
# Row filters
Operate across rows, changing which rows are returned
usrAvgAge = Read DB User { Avg "age", GroupBy "dept"
, Exclude "name" }
name age department
[ "person1", 20 , IT
, "person2", 20 , IT
, "person3", 25 , Sales
]
=>
age department
[ 20.0 , IT
, 25.0 , Sales
]
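
The grouping example above, sketched on a plain record with functions from `base`. The names `User`, `Dept`, `avgAgeByDept` and `sample` are hypothetical, and TypeQL itself works on HLists rather than on a fixed record.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

data Dept = IT | Sales deriving (Eq, Ord, Show)

data User = User { name :: String, age :: Int, dept :: Dept }

-- Counterpart of: Avg "age", GroupBy "dept", Exclude "name"
-- One (average age, department) pair per department.
avgAgeByDept :: [User] -> [(Double, Dept)]
avgAgeByDept users =
  [ (mean (map age grp), dept u)
  | grp@(u : _) <- groupBy ((==) `on` dept) (sortOn dept users)
  ]
  where
    mean xs = fromIntegral (sum xs) / fromIntegral (length xs)

sample :: [User]
sample =
  [ User "person1" 20 IT
  , User "person2" 20 IT
  , User "person3" 25 Sales
  ]
```

Unlike a transformational filter, the result has a different number of rows and a different shape than the input.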
-----
# Concurrent by default
Actions run concurrently
usrAndProfs =
InnerJoin { usrs = Read DB User {}
, profs = Api Profile {}
} { input joinOn
:with { usrs.id
, profs.userId
}
:using-hs `(==)`
}
Thanks to Haskell's lightweight green threads
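
A minimal sketch of what "actions run concurrently" means at the value level, using only `forkIO` and `MVar` from `base`. The fetchers are stand-ins that return fixed data, all names are hypothetical, and there is no exception handling; production code would more likely use `concurrently` from the async package.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Run two actions on separate green threads and wait for both results.
concurrently2 :: IO a -> IO b -> IO (a, b)
concurrently2 left right = do
  lv <- newEmptyMVar
  rv <- newEmptyMVar
  _ <- forkIO (left >>= putMVar lv)
  _ <- forkIO (right >>= putMVar rv)
  (,) <$> takeMVar lv <*> takeMVar rv

-- Stand-in for: Read DB User {}      (id, name)
fetchUsers :: IO [(Int, String)]
fetchUsers = threadDelay 1000 >> pure [(1, "person1"), (2, "person2")]

-- Stand-in for: Api Profile {}       (userId, bio)
fetchProfiles :: IO [(Int, String)]
fetchProfiles = threadDelay 1000 >> pure [(2, "likes Haskell"), (3, "no matching user")]

-- Keep every pair of rows for which the predicate holds.
innerJoinOn :: (a -> b -> Bool) -> [a] -> [b] -> [(a, b)]
innerJoinOn p as bs = [(a, b) | a <- as, b <- bs, p a b]

usrAndProfs :: IO [((Int, String), (Int, String))]
usrAndProfs = do
  (usrs, profs) <- concurrently2 fetchUsers fetchProfiles
  pure (innerJoinOn (\u p -> fst u == fst p) usrs profs)
```

Both fetches are started before either result is awaited, so the join waits for roughly the slower of the two rather than their sum.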
-----
# Concurrent by default (2)
* Haxl is a Haskell library that simplifies access to remote data, such as databases or web-based services.
* Efficient scheduling of concurrent data accesses
{-# LANGUAGE ApplicativeDo #-}
usrAndProfs = do
usrs <- dataFetch (Read :: Read DB User)
apis <- dataFetch (Api :: Api Profile )
innerJoin usrs apis
-----
# Downsides
- Compile time and memory use are often high when there are many type-level operations.
- Error messages, unless handled by custom type errors, can often be very cryptic.
- Type-level code is syntactically and semantically different from ordinary value-level Haskell.
- It is often hard to hire Haskell developers.
-----
# Takeaways
- Prevent bugs at compile time.
- Avoid the boilerplate of near-duplicate types.
- Queries run concurrently, thanks to Haskell's excellent concurrency story.
- Types can encode domain-specific information, and the compiler can be used to analyze it.
-----
# Questions?
-----
# Thank You
-----
# Addendum
-----
# Row filters
- Profunctors to capture the operation
class Profunctor p where
dimap :: (b -> a) -> (c -> d) -> p a c -> p b d
- `(->)`, `Star`, and `Fold` (from the foldl library) are all profunctors.
usrsOver20 = Read DB User { FilterBy age
:using-hs (> 20) }
-- data Star f a b = Star { runStar :: a -> f b }
-- Star Maybe models filtering
name => Star $ \x -> Just x
age => Star $ \x -> if x > 20
then Just x
else Nothing
dept => Star $ \x -> Just x
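
The per-field `Star Maybe` filters above can be tried out with local definitions, as in this sketch. `Star` and `Profunctor` are redefined here so the block needs only `base` (the profunctors package provides the real ones), and `keepIf`, `agesOver20` and `ageOfRow` are hypothetical helpers.

```haskell
import Data.Maybe (mapMaybe)

newtype Star f a b = Star { runStar :: a -> f b }

class Profunctor p where
  dimap :: (b -> a) -> (c -> d) -> p a c -> p b d

instance Functor f => Profunctor (Star f) where
  dimap f g (Star h) = Star (fmap g . h . f)

-- Keep a value only when the predicate holds; Nothing means "drop the row".
keepIf :: (a -> Bool) -> Star Maybe a a
keepIf p = Star (\x -> if p x then Just x else Nothing)

-- Running the filter over a column is just mapMaybe.
agesOver20 :: [Int] -> [Int]
agesOver20 = mapMaybe (runStar (keepIf (> 20)))

-- dimap re-targets the age filter at a whole (name, age) row.
ageOfRow :: Star Maybe (String, Int) Int
ageOfRow = dimap snd id (keepIf (> 20))
```

For example, `agesOver20 [18, 21, 20, 35]` keeps 21 and 35, and `mapMaybe (runStar ageOfRow)` keeps the ages of the rows that pass.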
-----
# Row filters (2)
- Stitch the per-field profunctors together with a product profunctor.
name :: Star Maybe Text Text
age :: Star Maybe Age Age
dept :: Star Maybe Department Department
Star Maybe
(HList '["name" ::: Text
, "age" ::: Age
, "dept" ::: Department])
(HList '["name" ::: Text
, "age" ::: Age
, "dept" ::: Department])
class Profunctor p => ProductProfunctor p where
purePP :: b -> p a b
(****) :: p a (b -> c) -> p a b -> p a c
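
A sketch of the stitching step, assuming a plain `User` record in place of the HList and local copies of the classes so that it compiles with `base` alone. The instances for `Star f` and the names `keepIf` and `rowFilter` are illustrative assumptions.

```haskell
import Data.Maybe (mapMaybe)

newtype Star f a b = Star { runStar :: a -> f b }

class Profunctor p where
  dimap :: (b -> a) -> (c -> d) -> p a c -> p b d

class Profunctor p => ProductProfunctor p where
  purePP :: b -> p a b
  (****) :: p a (b -> c) -> p a b -> p a c
infixl 4 ****

instance Functor f => Profunctor (Star f) where
  dimap f g (Star h) = Star (fmap g . h . f)

instance Applicative f => ProductProfunctor (Star f) where
  purePP b = Star (\_ -> pure b)
  Star f **** Star g = Star (\a -> f a <*> g a)

data User = User { name :: String, age :: Int, dept :: String }
  deriving (Eq, Show)

keepIf :: (a -> Bool) -> Star Maybe a a
keepIf p = Star (\x -> if p x then Just x else Nothing)

-- One Star per field, each focused on its field with dimap, then combined.
-- If any field yields Nothing, the whole row is dropped.
rowFilter :: Star Maybe User User
rowFilter =
  purePP User
    **** dimap name id (Star Just)
    **** dimap age  id (keepIf (> 20))
    **** dimap dept id (Star Just)
```

`mapMaybe (runStar rowFilter) users` then keeps exactly the users whose age is above 20, with the other fields passed through unchanged.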
-----
# Row filters (3)
- `proMap` executes the pipeline.
class (Profunctor pro
) => ProMapping pro f' f | pro f' -> f where
proMap :: pro x' x -> f' x' -> f x
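
Two instances one could imagine for this class are sketched below. They are assumptions made for illustration, not TypeQL's actual instances: a plain function maps over a list, and a `Star Maybe` filters it. `Star` and `Profunctor` are again defined locally.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}

import Data.Maybe (mapMaybe)

newtype Star f a b = Star { runStar :: a -> f b }

class Profunctor p where
  dimap :: (b -> a) -> (c -> d) -> p a c -> p b d

instance Profunctor (->) where
  dimap f g h = g . h . f

instance Functor f => Profunctor (Star f) where
  dimap f g (Star h) = Star (fmap g . h . f)

-- The profunctor and the input container determine the output container.
class Profunctor pro => ProMapping pro f' f | pro f' -> f where
  proMap :: pro x' x -> f' x' -> f x

-- A transformational filter: same number of rows out as in.
instance ProMapping (->) [] [] where
  proMap = map

-- A row filter: rows mapped to Nothing are dropped.
instance ProMapping (Star Maybe) [] [] where
  proMap (Star f) = mapMaybe f
```

With these, `proMap (+ 1)` increments every element of a list, while `proMap` applied to a `Star Maybe` predicate drops the elements that fail it, so one entry point runs both kinds of filter.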
-----