This module represents part of the bound names environment used in the type checker. It doesn't cover the stuff that is contained in the catalog (so it is slightly misnamed), but focuses only on identifiers introduced by things like tablerefs, sub selects, plpgsql parameters and variables, etc.
> {-# LANGUAGE DeriveDataTypeable,TupleSections,ScopedTypeVariables,OverloadedStrings #-}
> module Database.HsSqlPpp.Internals.TypeChecking.Environment
>     (-- * abstract environment value
>      Environment
>     ,JoinType(..)
>      -- * environment create and update functions
>     ,emptyEnvironment
>     ,isEmptyEnv
>     ,envCreateTrefEnvironment
>     ,createJoinTrefEnvironment
>     ,envSelectListEnvironment
>     ,createCorrelatedSubqueryEnvironment
>     ,createTrefAliasedEnvironment
>     ,brokeEnvironment
>     ,orderByEnvironment
>      -- * environment query functions
>     ,envLookupIdentifier
>     ,envExpandStar
>     ) where
> import Data.Data
> import Data.Char
> --import Data.Maybe
> import Control.Monad
> import Control.Arrow
> import Data.List
> --import Debug.Trace
> --import Text.Show.Pretty
> import Database.HsSqlPpp.Internals.TypesInternal
> import Database.HsSqlPpp.Internals.TypeChecking.TypeConversion.TypeConversion
> import Database.HsSqlPpp.Internals.Catalog.CatalogInternal
> import Database.HsSqlPpp.Internals.Catalog.CatalogTypes hiding (ncStr)
> import Data.Generics.Uniplate.Data
> import Data.Text (Text)
> import qualified Data.Text as T
--------------------------------- Alex: Not sure that handling of USING joins (and specifically of join keys) is correct, but don't know, have to investigate all the connections in the code. One problem is that types for the purpose of comparison are different from the output types. Moreover, major DBs are imprecise in their definitions of USING joins. For instance, what Postgre writes about output columns seems to be related only to the star expansion. I still can explicitly reference a key column from the inner table in the output expression, with the use of table name or alias. Or can I not? Have to look at ANSI. More precisely, we have: 1. Type check join keys to see that they are compatible. 2. If needed, create additional columns that would exist for the purpose of comparison only. The scalar expressions that create them are casts from join keys. 3. If a join key is used in an output scalar expression, the original' key type is used. For a qualified column reference, including qualified star expansion, original' type is same as original type, except that it is possibly cast to nullable for an inner table of an outer join. For star expansion: The original' type is defined strictly for LeftOuter and RightOuter. It is unclear what to do in case of Inner and FullOuter if the types differ (but, as we already know, are compatible). There are two choices: 1. Reject the query. 2. Resolve the types and return the result type. For now, I'll do the 2nd thing, let's see what happens. And I don't implement the rest of this plan either, of course: - I convert things to nullable for outer joins, but don't separate between condition columns and output columns, of course; - For Right Outer Joins, I reverse the order of input environments in listBindingsTypes, for the purpose of getting the key types. And now I see that TypeExtra's in join ids are not actually used.
> -- | Represent an environment using an abstracted version of the syntax
> -- which produced the environment. This structure has all the catalog
> -- queries resolved. No attempt is made to combine environment parts from
> -- different sources, they are just stacked together, the logic for
> -- working with combined environments is in the query functions below
> data Environment =
>                  -- | represents an empty environment, makes e.g. joining
>                  -- the environments for a list of trefs in a select list
>                  -- more straightforward
>                    EmptyEnvironment
>                  -- | represents the bindings introduced by a tableref:
>                  -- the name, the public fields, the private fields
>                  | SimpleTref (Text,Text) [(Text,TypeExtra)] [(Text,TypeExtra)]
>                  -- | environment from joining two tables
>                  | JoinTref [(Text,TypeExtra)] -- join ids
>                             JoinType  -- added because outer join makes some things nullabie
>                             Environment Environment
>                  -- | environment from a sub select
>                  | SelectListEnv [(Text,TypeExtra)]
>                    -- | correlated subquery environment
>                  | CSQEnv Environment -- outerenv
>                           Environment -- main env
>                    -- | an aliased tref
>                  | TrefAlias Text (Maybe [Text]) Environment
>                  | BrokeEnvironment
>                    -- order by: can use the name of a select list column
>                    -- or anything from the same environment which select
>                    -- list operates on
>                  | OrderByEnvironment Environment Environment
>                    deriving (Data,Typeable,Show,Eq)
>
> data JoinType = Inner | LeftOuter | RightOuter | FullOuter
>                    deriving (Data,Typeable,Show,Eq)
--------------------------------------------------- Create/ update functions, these are shortcuts to create environment variables, the main purpose is to encapsulate looking up information in the catalog and combining environment values with updates TODO: remove the create prefixes
> emptyEnvironment :: Environment
> emptyEnvironment = EmptyEnvironment
> isEmptyEnv :: Environment -> Bool
> isEmptyEnv EmptyEnvironment = True
> isEmptyEnv _ = False
> envCreateTrefEnvironment :: Catalog -> [NameComponent] -> Either [TypeError] Environment
> envCreateTrefEnvironment cat tbnm = do
>   (nm,pub,prv) <- catLookupTableAndAttrs cat tbnm
>   return $ SimpleTref nm pub (second mkTypeExtraNN `map` prv)
> envSelectListEnvironment :: [(Text,TypeExtra)] -> Either [TypeError] Environment
> envSelectListEnvironment cols =
>   return $ SelectListEnv $ map (first $ T.map toLower) cols
> -- | create an environment as two envs joined together
> createJoinTrefEnvironment :: Catalog
>                           -> Environment
>                           -> Environment
>                           -> JoinType
>                           -> Maybe [NameComponent] -- join ids: empty if cross join
>                                                    -- nothing for natural join
>                           -> Either [TypeError] Environment
> createJoinTrefEnvironment cat tref0 tref1 jt jsc = do
>   -- todo: handle natural join case
>   (jids::[Text]) <- case jsc of
>             Nothing -> do
>                        j0 <- fmap (map (snd . fst)) $ envExpandStar Nothing tref0
>                        j1 <- fmap (map (snd . fst)) $ envExpandStar Nothing tref1
>                        return $ j0 `intersect` j1
>             Just x -> return $ map nmcString x
>  --         maybe (error "natural join ids") (map (nnm . (:[]))) jsc
>   jts <- forM jids $ \i -> do
>            (_,t0) <- envLookupIdentifier [QNmc $ T.unpack i] tref0
>            (_,t1) <- envLookupIdentifier [QNmc $ T.unpack i] tref1
>            let  adjustTypeExtra te = case jt of
>                     Inner -> te
>                     LeftOuter -> t0
>                     RightOuter -> t1
>                     FullOuter -> mkNullable te
>            fmap ((i,) . adjustTypeExtra) $ resolveResultSetTypeExtra cat [t0,t1]
>            -- ImplicitCastToDo: remove or restore
>            --fmap (i,) $ resolveResultSetTypeExtra cat [t0,t1]
>   -- todo: check type compatibility
>   return $ JoinTref jts jt tref0 tref1
> createCorrelatedSubqueryEnvironment :: Environment -> Environment -> Environment
> createCorrelatedSubqueryEnvironment = CSQEnv
> createTrefAliasedEnvironment :: Text -> Maybe [Text] -> Environment -> Environment
> createTrefAliasedEnvironment = TrefAlias
> -- | represents type check failure upstream, don't produce additional
> -- type check errors
> brokeEnvironment :: Environment
> brokeEnvironment = BrokeEnvironment
> isBroken :: Environment -> Bool
> isBroken env = not $ null [() | BrokeEnvironment <- universeBi env]
> orderByEnvironment :: Environment -> Environment -> Environment
> orderByEnvironment = OrderByEnvironment
------------------------------------------------------- The main hard work is done in the query functions: so the idea is that the update functions create environment values which contain the context free contributions of each part of the ast to the current environment, and these query functions do all the work of resolving implicit correlation names, ambigous identifiers, etc. for each environment type, provide two functions which do identifier lookup and star expansion
> listBindingsTypes :: Environment -> ((Maybe Text,Text) -> [((Text,Text),TypeExtra)]
>                                     ,Maybe Text -> [((Text,Text),TypeExtra)] -- star expand
>                                     )
> listBindingsTypes EmptyEnvironment = (const [],const [])
> listBindingsTypes BrokeEnvironment = (const [],const [])
> listBindingsTypes (TrefAlias ta Nothing env) =
>   (\(q,n) -> if q `elem` [Nothing, Just ta]
>              then req $ fst (listBindingsTypes env) (Nothing,n)
>              else []
>   ,\q -> if q `elem` [Nothing, Just ta]
>          then req $ snd (listBindingsTypes env) Nothing
>          else [])
>   where
>     req = map (\((_,i),t) -> ((ta,i),t))
> listBindingsTypes (TrefAlias ta (Just cs) env) =
>   (\(q,n) -> --trace ("lookup: " ++ show (q,n)) $
>      if q `elem` [Nothing, Just ta]
>      then    --really hacky, assume the ids come out of the star expansion in same order
>              -- almost certainly wrong some of the time
>              case elemIndex n cs of
>                Just i -> let s :: [((Text, Text), TypeExtra)]
>                              s = (snd (listBindingsTypes env) Nothing)
>                          in {-trace ("getit : " ++ show (i,show s))
>                                      $ -}
>                             -- map to change the qualifier name to match
>                             -- this alias not the source tref
>                             map (\((_,_j),t) -> ((ta,n),t)) $ take 1 $ drop i s
>                Nothing -> []
>      else []
>   ,\q -> if q `elem` [Nothing, Just ta]
>          then let -- if there are too many aliases for the aliased tref
>                   -- the extras are ignored (not sure if this is correct)
>                   -- if there are not enough, the extras are kept without
>                   -- being renamed (think this is correct)
>                   repColNames = map Just cs ++ repeat Nothing
>                   aliasize :: [((Text, Text), TypeExtra)] -> [((Text, Text), TypeExtra)]
>                   aliasize =
>                     flip zipWith repColNames (\r ((_,n),t) ->
>                              case r of
>                                Just r' -> ((ta,r'),t)
>                                Nothing -> ((ta,n),t))
>               in aliasize $ snd (listBindingsTypes env) Nothing
>          else [])
>   where
>     -- not sure why this is here, code layout is a bit confusing
>     _req = map (\((_,i),t) -> ((ta,i),t))
FIXME!!! (_,nm) ?
> listBindingsTypes (SimpleTref (_,nm) pus pvs) =
>   (\(q,n) -> let m (n',_) = (q `elem` [Nothing,Just nm])
>                             && n == n'
>              in addQual nm $ filter m $ pus ++ pvs
>   ,\q -> case () of
>            _ | q `elem` [Nothing, Just nm] -> addQual nm pus
>              | otherwise -> [])
> listBindingsTypes (JoinTref jids jt env0 env1) =
>   (idens,starexp)
>   where
>     idens k = let [iOuter,iInner] = (if jt==RightOuter then reverse else id) [is0 k, is1 k]
>               in if not (null iOuter) && snd k `elem` jnames
>                  then iOuter
>                  else iOuter ++ iInner
>     _useResolvedType tr@((q,n),_) = case lookup n jids of
>                                    Just t' -> ((q,n),t')
>                                    Nothing -> tr
>     jnames = map fst jids
>     isJ ((_,n),_) = n `elem` jnames
todo: use useResolvedType unqualified star: reorder the ids so that the join columns are first
>     starexp Nothing = let (aj,anj) = partition isJ (st0 Nothing)
>                           bnj = filter (not . isJ) (st1 Nothing)
>                       in aj ++ anj ++ bnj
>     starexp q@(Just _) =
>       let s0 = st0 q
>           s1 = st1 q
>       in case (s0,s1) of
>            -- if we only get ids from one side, then don't
>            -- reorder them (is this right?)
>            (_:_,[]) -> s0
>            ([], _:_) -> s1
>            -- have ids coming from both sides
>            -- no idea how this is supposed to work
>            _ -> let (aj,anj) = partition isJ s0
>                     bnj = filter (not . isJ) s1
>                 in aj ++ anj ++ bnj
>     (is0,st0) = (if jt `elem` [RightOuter,FullOuter] then addNullability else id)
>                 $ listBindingsTypes env0
>     (is1,st1) = (if jt `elem` [LeftOuter,FullOuter] then addNullability else id)
>                 $ listBindingsTypes env1
>     addNullability = (map (second mkNullable) .) *** (map (second mkNullable) .)
selectlistenv: not quite right, but should always have an alias so the empty qualifier never gets very far
> listBindingsTypes (SelectListEnv is) =
>   (\(_,n) -> addQual "" $ filter ((==n).fst) is
>   ,const $ addQual "" is)
not quite right, see queryexprs.ag
> listBindingsTypes (OrderByEnvironment sl tr) =
>   (\i ->
>      -- hack: return the tref first so that
>      -- a qualifier can be added. This
>      -- is probably more wrong than the other
>      -- way round
>      case (fst (listBindingsTypes tr) i
>           ,fst (listBindingsTypes sl) i) of
>        ([],x) -> x
>        (y,_) -> y
>   ,const [])
csq just uses standard shadowing for iden lookup for star expand, the outer env is ignored
> listBindingsTypes (CSQEnv outerenv env) =
>   (\k -> case (fst (listBindingsTypes env) k
>               ,fst (listBindingsTypes outerenv) k) of
>            (x,_) | not (null x) -> x
>            (_, x) | not (null x)  -> x
>            _ -> []
>   ,snd $ listBindingsTypes env)
> addQual :: Text -> [(Text,TypeExtra)] -> [((Text,Text),TypeExtra)]
> addQual q = map (\(n,t) -> ((q,n),t))
------------------------------------------------------- use listBindingsTypes to implement expandstar and lookupid
> envExpandStar:: Maybe NameComponent -> Environment
>                 -> Either [TypeError] [((Text,Text),TypeExtra)]
> envExpandStar {-nmc env-} = {-let r =-} envExpandStar2 {-nmc env-}
>                         {-in trace ("env expand star: " ++ show nmc ++ " " ++ show r)
>                            r-}
> envExpandStar2 :: Maybe NameComponent -> Environment -> Either [TypeError] [((Text,Text),TypeExtra)]
> envExpandStar2 nmc env =
>   if isBroken env
>   then Left []
>   else
>     let st = snd (listBindingsTypes env) $ fmap nmcString nmc
>     in if null st
>        then case nmc of
>               Just x -> Left [UnrecognisedCorrelationName $ nmcString x]
>               Nothing -> Left [BadStarExpand]
>        else Right st
> nmcString :: NameComponent -> Text
> nmcString (QNmc n) = T.pack n
> nmcString (Nmc n) = T.pack $ map toLower n
> -- todo: don't use error
> nmcString (AntiNameComponent _) = error "tried to get ncstr of antinamecomponent"
> envLookupIdentifier :: [NameComponent] -> Environment
>                      -> Either [TypeError] ((Text,Text), TypeExtra)
> envLookupIdentifier nmc env = --trace ("lookup: " ++ show nmc  ++ "\n" ++ ppShow env) $
>   if isBroken env
>   then Left []
>   else do
>     k <- case nmc of
>                [a,b] -> Right (Just $ nmcString a, nmcString b)
>                [b] -> Right (Nothing, nmcString b)
>                [_,_,_] -> Left [SchemadColumnName "an identifier cannot be used with an explicit schema name, please use only a correlation name without a schema name (you can use a table reference alias to disambiguate if you need to)."]
>                [_,_,_,_] -> Left [DbSchemadColumnName "an identifier cannot be used with an explicit database name and schema name, please use only a correlation name without a schema name (you can use a table reference alias to disambiguate if you need to)."]
>                _ -> Left [InternalError "too many nmc components in envlookupiden"]
>     case (fst $ listBindingsTypes env) k of
>       [] -> Left [UnrecognisedIdentifier $ nmcString $ last nmc]
>       [x] -> Right $ keepCasehack x
>       _ -> Left [AmbiguousIdentifier $ nmcString $ last nmc]
>   where
>     keepCasehack ((na,nb),t) =
>       case nmc of
>         [a,b] -> let x = ((keepcase a na,keepcase b nb),t)
>                  in {-(if True -- map toLower nb == "ou_id"
>                      then trace ("\n\n*********************\n\nlookup: " ++ show x ++ "\n\n********************************\n\n" ++ ppShow env ++ "\n\n********************************\n\n")
>                      else id)-} x
>         [b] -> ((na,keepcase b nb),t)
>         _ -> error "too many nmc components in envlookupiden(2)"
>     keepcase orig new = -- sanity check: make sure the nmcs are equal
>                         if T.map toLower new == nmcString orig
>                         then noLower orig
>                         else new
>     noLower (QNmc n) = T.pack n
>     noLower (Nmc n) = T.pack n
>     noLower (AntiNameComponent n) = error $ "bad antinamecomponent in Environment.envLookupIdentifier.noLower " ++ n
-------------------------- adding for plpgsql notes: additional envs * parameter in function * declaration in function block * implicit integer loop var in for loop * set explicit record type in for loop/ assignment to record type * for constraints in create table, create domain Write tests to quickly check each bit of code which uses these using the full typechecking: update: sets, where, returning select: tref -> select list, where, group by, order by join: out to tref, into on expression implicit variable in for loop record type in for loop record type in assignment record type in select into delete where and returning block declarations constraints in create table, create domain parameters in function body statementlist: pass on record updates? insert: columns?, returning