Copyright | (c) 2009, 2010 Bryan O'Sullivan, (c) 2009 Simon Marlow |
---|---|
License | BSD-style |
Maintainer | bos@serpentine.com |
Stability | experimental |
Portability | GHC |
Safe Haskell | Trustworthy |
Language | Haskell98 |
Efficient locale-sensitive support for text I/O.
Skip past the synopsis for some important notes on performance and portability across different versions of GHC.
- readFile :: FilePath -> IO Text
- writeFile :: FilePath -> Text -> IO ()
- appendFile :: FilePath -> Text -> IO ()
- hGetContents :: Handle -> IO Text
- hGetChunk :: Handle -> IO Text
- hGetLine :: Handle -> IO Text
- hPutStr :: Handle -> Text -> IO ()
- hPutStrLn :: Handle -> Text -> IO ()
- interact :: (Text -> Text) -> IO ()
- getContents :: IO Text
- getLine :: IO Text
- putStr :: Text -> IO ()
- putStrLn :: Text -> IO ()
Performance
The functions in this module obey the runtime system's locale, character set encoding, and line ending conversion settings.
If you know in advance that you will be working with data that has a specific encoding (e.g. UTF-8), and your application is highly performance sensitive, you may find that it is faster to perform I/O with bytestrings and to encode and decode yourself than to use the functions in this module.
Whether this will hold depends on the version of GHC you are using, the platform you are working on, the data you are working with, and the encodings you are using, so be sure to test for yourself.
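As a rough illustration of the bytestring approach described above, the following sketch reads and writes UTF-8 files without going through this module. The helper names `readFileUtf8` and `writeFileUtf8` are hypothetical and not part of any library; the sketch assumes the data really is UTF-8 and that no newline translation is wanted.

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import Data.Text.Encoding (decodeUtf8', encodeUtf8)
import Data.Text.Encoding.Error (UnicodeException)

-- | Hypothetical helper: read a whole file as raw bytes and decode it as
-- UTF-8, ignoring the locale. Invalid input is reported as a 'Left' value
-- rather than as an exception.
readFileUtf8 :: FilePath -> IO (Either UnicodeException Text)
readFileUtf8 path = decodeUtf8' <$> B.readFile path

-- | Hypothetical helper: encode the text as UTF-8 and write the bytes
-- unchanged (no line ending conversion is performed).
writeFileUtf8 :: FilePath -> Text -> IO ()
writeFileUtf8 path = B.writeFile path . encodeUtf8
```

Whether this is actually faster than `readFile` and `writeFile` from this module is something to measure on your own data and GHC version.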
Locale support
Note: The behaviour of functions in this module depends on the version of GHC you are using.
Beginning with GHC 6.12, text I/O is performed using the system or handle's current locale and line ending conventions.
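Because the handle's own settings are used, one way to pin down the encoding on GHC 6.12 or later is to set it on the handle before reading. The sketch below assumes the file is UTF-8; `readUtf8Handle` is a hypothetical name, while `hSetEncoding` and `utf8` come from `System.IO`.

```haskell
import Data.Text (Text)
import qualified Data.Text.IO as TIO
import System.IO (IOMode (ReadMode), hSetEncoding, utf8, withFile)

-- | Hypothetical helper: read a file as UTF-8 regardless of the locale
-- that the runtime system picked up from the environment.
readUtf8Handle :: FilePath -> IO Text
readUtf8Handle path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8   -- override the handle's default (locale) encoding
    TIO.hGetContents h    -- strict read, so leaving withFile afterwards is safe
```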
Under GHC 6.10 and earlier, the system I/O libraries do not support locale-sensitive I/O or line ending conversion. On these versions of GHC, functions in this library all use UTF-8. What does this mean in practice?
- All data that is read will be decoded as UTF-8.
- Before data is written, it is first encoded as UTF-8.
- On both reading and writing, the platform's native newline conversion is performed.
If you must use a non-UTF-8 locale on an older version of GHC, you will have to perform the transcoding yourself, e.g. as follows:
    import qualified Data.ByteString as B
    import Data.Text (Text)
    import Data.Text.Encoding (encodeUtf16LE)

    putStr_Utf16LE :: Text -> IO ()
    putStr_Utf16LE t = B.putStr (encodeUtf16LE t)
readFile :: FilePath -> IO Text
The readFile function reads a file and returns the contents of the file as a string. The entire file is read strictly, as with getContents.
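A minimal usage sketch follows; `countLines` is a hypothetical helper. Since the read is strict, the whole file is in memory before any lines are counted.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Hypothetical helper: count the lines of a file by reading it
-- strictly and splitting the result on newlines.
countLines :: FilePath -> IO Int
countLines path = length . T.lines <$> TIO.readFile path
```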
writeFile :: FilePath -> Text -> IO ()
Write a string to a file. The file is truncated to zero length before writing begins.
appendFile :: FilePath -> Text -> IO ()
Write a string to the end of a file.
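The difference between the two functions can be sketched with a small log file. The names `startLog` and `appendEntry` are hypothetical and exist only for this illustration.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Hypothetical helper: start a fresh log. Any previous contents are
-- discarded, because writeFile truncates the file first.
startLog :: FilePath -> IO ()
startLog path = TIO.writeFile path (T.pack "log started\n")

-- | Hypothetical helper: add one entry after the existing contents.
appendEntry :: FilePath -> T.Text -> IO ()
appendEntry path entry = TIO.appendFile path (entry `T.snoc` '\n')
```

Calling `startLog` a second time on the same path drops every entry appended so far.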
Operations on handles
hGetContents :: Handle -> IO Text
Read the remaining contents of a Handle as a string. The Handle is closed once the contents have been read, or if an exception is thrown.
Internally, this function reads a chunk at a time from the lower-level buffering abstraction, and concatenates the chunks into a single string once the entire file has been read.
As a result, constructing the result requires approximately twice as much memory as the result itself occupies. For files larger than half of available RAM, this may lead to memory exhaustion.
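The closing behaviour described above can be observed with a short sketch. `slurp` is a hypothetical helper that returns the contents together with the handle's state after the read; `hIsClosed` comes from `System.IO`.

```haskell
import Data.Text (Text)
import qualified Data.Text.IO as TIO
import System.IO (IOMode (ReadMode), hIsClosed, openFile)

-- | Hypothetical helper: read everything from a freshly opened handle
-- and report whether the handle was closed by the read.
slurp :: FilePath -> IO (Text, Bool)
slurp path = do
  h <- openFile path ReadMode
  contents <- TIO.hGetContents h  -- reads to EOF, then closes h
  closed <- hIsClosed h           -- no explicit hClose was issued above
  pure (contents, closed)
```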
hGetChunk :: Handle -> IO Text
Experimental. Read a single chunk of strict text from a Handle. The size of the chunk depends on the amount of input currently buffered.
This function blocks only if there is no data available, and EOF has not yet been reached. Once EOF is reached, this function returns an empty string instead of throwing an exception.
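Since an empty string signals EOF, a chunked read can be written as a simple loop. The sketch below counts characters without holding the whole input in memory; `countChars` and `countCharsInFile` are hypothetical names.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (Handle, IOMode (ReadMode), withFile)

-- | Hypothetical helper: consume a handle one chunk at a time, keeping
-- only a running character count.
countChars :: Handle -> IO Int
countChars h = go 0
  where
    go n = do
      chunk <- TIO.hGetChunk h
      if T.null chunk
        then pure n                       -- empty chunk: EOF was reached
        else let n' = n + T.length chunk
             in n' `seq` go n'            -- force the count before looping

-- | Hypothetical wrapper that opens the file and runs the loop on it.
countCharsInFile :: FilePath -> IO Int
countCharsInFile path = withFile path ReadMode countChars
```

This keeps memory use bounded by the chunk size, in contrast to the doubling described for hGetContents.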
Special cases for standard input and output
interact :: (Text -> Text) -> IO ()
The interact function takes a function of type Text -> Text as its argument. The entire input from the standard input device is passed to this function as its argument, and the resulting string is output on the standard output device.
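As a small illustration, a filter that upper-cases its input could be sketched as follows. `shout` and `runShout` are hypothetical names; a program would typically use `runShout` as its `main`.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Hypothetical pure transformation applied to all of standard input.
shout :: T.Text -> T.Text
shout = T.toUpper

-- | Read all of stdin, transform it, and write the result to stdout.
runShout :: IO ()
runShout = TIO.interact shout
```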
getContents :: IO Text
Read all user input on stdin as a single string.
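A minimal sketch of a typical use, assuming the input fits in memory: count the words arriving on stdin. `wordCount` and `countStdinWords` are hypothetical names.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Hypothetical pure helper: number of whitespace-separated words.
wordCount :: T.Text -> Int
wordCount = length . T.words

-- | Read everything from stdin, then count the words in it.
countStdinWords :: IO Int
countStdinWords = wordCount <$> TIO.getContents
```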