contextual

simple and typesafe interpolated strings, checked at compile-time

Getting started

SBT

"com.propensive" %% "contextual" % "1.0.0"

Import

import contextual._

About contextual

Contextual is a small Scala library for defining your own string interpolators: prefixed string literals like url"https://google.com" whose interpretation is determined at compile-time, including any custom checks and compile errors that should be reported. You write only very ordinary "user" code: no macros!

A simple example

We can define a simple interpolator for URLs like this:

import contextual._

case class Url(url: String)

object UrlInterpolator extends Interpolator {
  
  def contextualize(interpolation: StaticInterpolation) = {
    val lit@Literal(_, urlString) = interpolation.parts.head
    if(!checkValidUrl(urlString))
      interpolation.abort(lit, 0, "not a valid URL")

    Nil
  }

  def evaluate(interpolation: RuntimeInterpolation): Url =
    Url(interpolation.literals.head)
}

implicit class UrlStringContext(sc: StringContext) {
  val url = Prefix(UrlInterpolator, sc)
}

At the use site, this makes the following possible:

scala> url"http://www.propensive.com/"
res: Url = Url(http://www.propensive.com/)

scala> url"foobar"
<console>: error: not a valid URL
       url"foobar"
           ^
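
The example above calls checkValidUrl without defining it. A minimal sketch of such a helper, assuming that "valid" simply means an absolute URI with a host, might look like this. The object name UrlCheck and the validation rule are assumptions made for illustration; they are not part of Contextual.

```scala
import java.net.URI
import scala.util.Try

object UrlCheck {
  // Hypothetical helper: accepts a string only if it parses as an
  // absolute URI (it has a scheme) and has a host component.
  def checkValidUrl(s: String): Boolean =
    Try(new URI(s)).toOption.exists(u => u.isAbsolute && u.getHost != null)
}
```

With this rule, "http://www.propensive.com/" is accepted, while "foobar" is rejected because it has no scheme.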

How does it work?

Scala lets you implement custom string interpolators. These may be implemented with a simple method definition, but the compiler also allows them to be implemented as macros. A macro can inspect the constant parts of an interpolated string at compile-time, along with the types of the expressions substituted into it.
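
For comparison, here is a sketch of the "simple method definition" route using only the Scala standard library, with no macro and no Contextual. The prefix name shout and the enclosing object are hypothetical. An interpolator written this way runs entirely at runtime, so it cannot report compile errors about the string's content.

```scala
object PlainInterpolatorDemo {
  // Adds a `shout` prefix to string literals by extending StringContext.
  implicit class ShoutContext(sc: StringContext) {
    // `sc.s` performs the standard interpolation; we upper-case the result.
    def shout(args: Any*): String = sc.s(args: _*).toUpperCase
  }

  def greet(name: String): String = shout"hello, $name"

  def main(args: Array[String]): Unit =
    println(greet("world"))
}
```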

Contextual provides a generalized macro for interpolating strings (with a prefix of your choice) that calls into a simple API for defining the compile-time checks and runtime implementation of the interpolated string.

This can be done without you writing any macro code.

Concepts

Interpolators

An Interpolator defines how an interpolated string should be understood, both at compile-time and at runtime. Often, these are similar operations, as both work on the same sequence of constant literal parts of the interpolated string, but they differ in how much is known about the holes, that is, the expressions being interpolated between the constant parts. At runtime the evaluated substituted values are available, whereas at compile-time the values are unknown. We do, however, have access to certain meta-information about the substitutions, which allows some useful constraints to be placed on them.
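
The split between literal parts and holes is visible in Scala's own StringContext, independently of Contextual. The sketch below uses a hypothetical inspect prefix that returns both sequences, showing that a string with n holes always has n + 1 literal parts.

```scala
object PartsAndHoles {
  implicit class InspectContext(sc: StringContext) {
    // Returns the constant literal parts and the runtime values of the holes.
    def inspect(args: Any*): (Seq[String], Seq[Any]) = (sc.parts, args)
  }

  // Two holes, therefore three literal parts: "http://", ":" and "/".
  def demo(host: String, port: Int): (Seq[String], Seq[Any]) =
    inspect"http://$host:$port/"
}
```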

The contextualize method

Interpolators have one abstract method which needs implementing to provide any compile-time checking or parsing functionality:

def contextualize(interpolation: StaticInterpolation): Seq[Context]

The contextualize method requires an implementation which inspects the literal parts and holes of the interpolated string. These are provided by the parts member of the interpolation parameter. interpolation is an instance of StaticInterpolation, and also provides methods for reporting errors and warnings at compile-time.

The evaluate method

The runtime implementation of the interpolator would typically be provided by defining an implementation of evaluate. This method is not part of the subtyping API, so it does not have to conform to an exact shape. It will be called with a single RuntimeInterpolation parameter whenever an interpolator is expanded, but it may take type parameters or implicit parameters (as long as these can be inferred), and may return a value of any type.

The StaticInterpolation and RuntimeInterpolation types

We represent the information about the interpolated string known at compile-time and runtime with the StaticInterpolation and RuntimeInterpolation types, respectively. At compile-time, these provide access to the constant literal parts of the interpolated string, metadata about the holes, and the means to report errors and warnings. At runtime, they provide the values substituted into the interpolated string, converted into a common "input" type. Normally String would be chosen for the input type, but it's not required.

Perhaps the most useful method of the interpolation types is the parts method, which gives the sequence of parts representing each section of the interpolated string: Literal values alternating with either Holes (at compile-time) or Substitutions (at runtime).
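
To make the alternating shape concrete, here is a simplified model in plain Scala. The types Lit and Sub and the function interleave are stand-ins invented for this sketch; they are not Contextual's own Literal and Substitution types, whose exact fields may differ.

```scala
object PartsSketch {
  sealed trait Part
  final case class Lit(index: Int, string: String) extends Part // constant text
  final case class Sub(index: Int, value: String) extends Part  // substituted value

  // Interleaves literals with substituted values: lit, sub, lit, sub, ..., lit.
  // Assumes there is exactly one more literal than there are values.
  def interleave(literals: Seq[String], values: Seq[String]): Seq[Part] = {
    val lits = literals.zipWithIndex.map { case (s, i) => Lit(i, s) }
    val subs = values.zipWithIndex.map { case (v, i) => Sub(i, v) }
    lits.zip(subs).flatMap { case (l, s) => Seq[Part](l, s) } ++ lits.drop(subs.length)
  }
}
```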

Contexts

When checking an interpolated string containing some DSL, holes may appear in different contexts within the string. For example, in an XML interpolated string, a substitution may be inside a pair of (matching) tags, or the value of an attribute, as in xml"<tag attribute=$att>$content</tag>". In order for the XML to be valid, the string att must be delimited by quotes, whereas the string content does not require quotes; both will require escaping. This difference is modeled with the concept of Contexts: user-defined objects which represent the position within a parsed interpolated string where a hole is, and which may be used to distinguish between alternative ways of making a substitution.

This idea is fundamental to any advanced implementation of the contextualize method: besides performing compile-time checks, the method should return a sequence of Contexts corresponding to each hole in the interpolated string. In the XML example above, this might be the sequence Seq(Attribute, Inline), referencing objects (defined at the same time as the Interpolator) which provide context to the substitutions of the att and content values.
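
As a rough illustration of how such a sequence could be computed, the sketch below scans the literal parts of the XML example and records, for each hole, whether it falls inside a tag. It is deliberately naive (it only tracks < and >) and is written in plain Scala. The Ctx type and the contextsOf function are hypothetical and do not use Contextual's Context type.

```scala
object ContextSketch {
  sealed trait Ctx
  case object Attribute extends Ctx // hole sits inside a tag, e.g. an attribute value
  case object Inline extends Ctx    // hole sits between tags

  // One context per hole; a hole follows every literal except the last.
  def contextsOf(literals: Seq[String]): Seq[Ctx] = {
    var inTag = false
    literals.dropRight(1).toList.map { lit =>
      lit.foreach {
        case '<' => inTag = true
        case '>' => inTag = false
        case _   => ()
      }
      if (inTag) Attribute else Inline
    }
  }
}
```

For the literal parts of xml"<tag attribute=$att>$content</tag>", namely "<tag attribute=", ">" and "</tag>", this yields Seq(Attribute, Inline).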

Generalizing Substitutions

A typical interpolator will allow only certain types to be used as substitutions. This may include a few common types like Ints, Booleans and Strings, but Contextual supports ad-hoc extension with typeclasses, making it possible for user-defined types to be supported as substitutions, too. However, in order for the interpolator to understand how to work with arbitrary types, which may not yet have been defined, the interpolator must agree on a common interface for all substitutions. This is the Input type, defined on the Interpolator, and every typeclass instance representing how a particular type should be embedded in an interpolated string must define how that type is converted to the common Input type.

Often, it is easy and sufficient to use String as the Input type.
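
The idea of converting every supported type to one common input type can be sketched with an ordinary typeclass. Everything below (AsInput, toInput, UserId) is a hypothetical model in plain Scala with String as the input type; it is not Contextual's own embedding API.

```scala
object InputSketch {
  // Hypothetical typeclass: how a value of type T becomes the common input type.
  trait AsInput[T] { def convert(value: T): String }

  implicit val intAsInput: AsInput[Int] =
    new AsInput[Int] { def convert(value: Int): String = value.toString }

  // A user-defined type can be supported later by adding its own instance.
  final case class UserId(n: Int)
  implicit val userIdAsInput: AsInput[UserId] =
    new AsInput[UserId] { def convert(value: UserId): String = s"user-${value.n}" }

  // Only types with an instance in scope are accepted as substitutions.
  def toInput[T](value: T)(implicit ev: AsInput[T]): String = ev.convert(value)
}
```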

Embedding types

Different types are embedded by defining an implicit Embedder typeclass instance, which specifies, with a number of Case instances, how the type should be converted to the interpolator's Input type. For example, given a hypothetical XML interpolator, Symbols could be embedded as follows:

implicit val embedSymbolsInXml = XmlInterpolator.embed[Symbol](
  Case(AttributeKey, AfterAtt)(_.name),
  Case(AttributeVal, InTag) { s => "\"" + s.name + "\"" },
  Case(Content, Content)(_.name)
)

Here the conversions to Strings are defined for three different contexts: AttributeKey, AttributeVal, and Content. In the first two cases, making the substitution changes the context; in the final case, the context is unchanged.
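
One way to picture what each Case expresses is as a rule of the form "in this context, convert the value like so, and continue in that context". The model below is plain Scala with invented names (Rule, embed, symbolRules); it mirrors the three cases above but is not how Contextual itself represents them.

```scala
object EmbedSketch {
  sealed trait Ctx
  case object AttributeKey extends Ctx
  case object AfterAtt extends Ctx
  case object AttributeVal extends Ctx
  case object InTag extends Ctx
  case object Content extends Ctx

  // Hypothetical rule: applies in `context`, leaves the parser in `after`.
  final case class Rule[T](context: Ctx, after: Ctx)(val convert: T => String)

  val symbolRules: Seq[Rule[Symbol]] = Seq(
    Rule[Symbol](AttributeKey, AfterAtt)(_.name),
    Rule[Symbol](AttributeVal, InTag)(s => "\"" + s.name + "\""),
    Rule[Symbol](Content, Content)(_.name)
  )

  // Picks the rule for the hole's context; None means the type cannot be
  // embedded there.
  def embed[T](rules: Seq[Rule[T]], ctx: Ctx, value: T): Option[(String, Ctx)] =
    rules.find(_.context == ctx).map(r => (r.convert(value), r.after))
}
```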

Attaching the interpolator to a prefix

Finally, in order to make a new string interpolator available through a prefix on a string, the Scala compiler needs to be able to "see" that prefix on Scala's built-in StringContext object. This is very easily done by specifying a new Prefix value with the desired name on an implicit class that wraps StringContext, as in the example above:

implicit class UrlStringContext(sc: StringContext) {
  val url = Prefix(UrlInterpolator, sc)
}

The Prefix constructor takes only two parameters: the Interpolator object (and it must be an object, otherwise the macro will not be able to invoke it at compile time), and the StringContext instance that we are extending.