Datalake Authorizer

The datalake authorizer is a custom authentication solution based on Lambda and DynamoDB. It uses JWT and supports both user and machine-to-machine flows.

Registering applications

For machine-to-machine setups, first add a set of client credentials to the datalake-authorizer-{ENVIRONMENT}-client-credentials-table DynamoDB table. The app_id is an arbitrary username chosen for your application, and app_secret is the SHA-256 hash of the password chosen for that application.
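
As a minimal sketch of preparing such a record, the snippet below hashes a plain-text secret before it is stored. It assumes the table expects a lowercase hex digest of the UTF-8 encoded secret, which the text above does not state explicitly; the function names are hypothetical.

```python
import hashlib


def hash_app_secret(app_secret: str) -> str:
    """Return the SHA-256 digest of the plain-text secret.

    Assumption: the table stores a lowercase hex digest of the
    UTF-8 encoded secret. Adjust if your deployment differs.
    """
    return hashlib.sha256(app_secret.encode("utf-8")).hexdigest()


def build_credentials_item(app_id: str, app_secret: str) -> dict:
    """Build the record to store; only the hash of the secret is kept."""
    return {"app_id": app_id, "app_secret": hash_app_secret(app_secret)}
```

Keep the plain-text secret somewhere safe: it is the value the application sends when requesting tokens, while the table only holds the hash.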

Next, add the allowed routes to the datalake-admin-{ENVIRONMENT}-access-table DynamoDB table. For an existing gateway, each allowed route is an item with this structure:

PK: <app_id>
SK: <api_name>/<route>
GSI1PK: <api_name>
GSI1SK: <app_id>
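
The structure above can be sketched as a small helper that builds one access item per allowed route. The function name is hypothetical, and the route is concatenated verbatim because the text does not say how leading slashes should be handled.

```python
def build_route_item(app_id: str, api_name: str, route: str) -> dict:
    """Build an access-table item granting app_id access to one route.

    Assumption: the route is appended to the API name exactly as given.
    """
    return {
        "PK": app_id,
        "SK": f"{api_name}/{route}",
        "GSI1PK": api_name,
        "GSI1SK": app_id,
    }


# Example with made-up names: allow "my-app" to call the "orders" route.
example_item = build_route_item("my-app", "my-api", "orders")
```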

If the gateway does not exist in the table yet, add it to the same table with this structure:

PK: <api_name>
SK: api_definition
api_id: <api_id>

Note: The SK is the literal api_definition string. The api_id can be found on the API Gateway page in the AWS console for that gateway.
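
A minimal sketch of the gateway definition item, with the literal sort key held in a constant so it cannot be mistyped. The function and constant names are hypothetical.

```python
# The sort key is the literal string, not a placeholder.
API_DEFINITION_SK = "api_definition"


def build_api_definition_item(api_name: str, api_id: str) -> dict:
    """Build the access-table item that registers a gateway."""
    return {"PK": api_name, "SK": API_DEFINITION_SK, "api_id": api_id}
```

If you use boto3, an item like this could be written with the DynamoDB table resource's put_item(Item=...) call, assuming your credentials have write access to the table.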

IaC

To apply the authorizer to your own APIs, follow the examples below, which cover architectures defined with CloudFormation or SAM templates.

CloudFormation

CloudFormation is the more verbose option. First, add an authorizer resource in this format:

YourProjectLambdaAuthorizer:
    Type: 'AWS::ApiGateway::Authorizer'
    Properties:
        RestApiId: !Ref YourProjectApi
        Type: "TOKEN"
        AuthorizerUri: !Sub "arn:aws:apigateway:eu-west-1:lambda:path/2015-03-31/functions/arn:aws:lambda:eu-west-1:${AWS::AccountId}:function:datalake-authorizer-${Environment}-LambdaAuthorizerApiGateway/invocations"
        IdentitySource: "method.request.header.x-api-key"
        Name: !Sub "${Project}-${Environment}-api-lambda-custom-authorizer"
        AuthorizerResultTtlInSeconds: 0
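
To make the shape of the AuthorizerUri easier to see, here is a small sketch that assembles it from its parts: the authorizer Lambda's ARN wrapped in the API Gateway invocation path. The function name and the example account ID are hypothetical; the region defaults to the eu-west-1 value used in the template.

```python
def build_authorizer_uri(
    account_id: str, environment: str, region: str = "eu-west-1"
) -> str:
    """Assemble the API Gateway invocation URI for the authorizer Lambda."""
    function_arn = (
        f"arn:aws:lambda:{region}:{account_id}:function:"
        f"datalake-authorizer-{environment}-LambdaAuthorizerApiGateway"
    )
    return (
        f"arn:aws:apigateway:{region}:lambda:path/2015-03-31/functions/"
        f"{function_arn}/invocations"
    )


# Example with a made-up account ID and environment name.
example_uri = build_authorizer_uri("123456789012", "dev")
```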

Next, in your AWS::ApiGateway::Method resources, split the OPTIONS method off from the others. OPTIONS (CORS preflight) requests do not carry authentication headers, so the authorizer Lambda would reject them.

This is a method that invokes a Lambda for any HTTP method:

GetResourceANY:
    Type: 'AWS::ApiGateway::Method'
    Properties:
        RestApiId: !Ref YourProjectApi
        ResourceId: !Ref GetResource
        HttpMethod: ANY
        AuthorizationType: CUSTOM
        AuthorizerId: !Ref YourProjectLambdaAuthorizer
        Integration:
            Type: AWS_PROXY
            IntegrationHttpMethod: POST
            Uri: !Sub >-
                arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${Get.Arn}/invocations

And this is the split-off OPTIONS method:

GetResourceCORS:
    Type: 'AWS::ApiGateway::Method'
    DependsOn:
        - GetResource
        - YourProjectLambdaAuthorizer
    Properties:
        RestApiId: !Ref YourProjectApi
        ResourceId: !Ref GetResource
        HttpMethod: OPTIONS
        AuthorizationType: NONE
        MethodResponses:
            - StatusCode: 200
              ResponseModels:
                  application/json: 'Empty'
              ResponseParameters:
                  method.response.header.Access-Control-Allow-Headers: false
                  method.response.header.Access-Control-Allow-Methods: false
                  method.response.header.Access-Control-Allow-Origin: false
        Integration:
            Type: MOCK
            IntegrationResponses:
                - StatusCode: 200
                  ResponseParameters:
                      method.response.header.Access-Control-Allow-Headers: "'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token'"
                      method.response.header.Access-Control-Allow-Methods: "'*'"
                      method.response.header.Access-Control-Allow-Origin: "'*'"
            RequestTemplates:
                application/json: '{"statusCode": 200}'

SAM

SAM (Serverless Application Model) is a much more concise way to define resources. In this case, simply add the Auth object to your existing AWS::Serverless::Api, like so:

YourProjectApi:
    Type: AWS::Serverless::Api
    Properties:
        Name: '***'
        StageName: !Ref Environment
        Auth:
            Authorizers:
                LambdaAuthorizerApiGateway:
                    FunctionArn: !Sub "arn:aws:lambda:eu-west-1:${AWS::AccountId}:function:datalake-authorizer-${Environment}-LambdaAuthorizerApiGateway"
                    Identity:
                        Header: x-api-key
            ApiKeyRequired: false

Then reference the authorizer in the Auth object of each event that should use it, as follows:

YourLambda:
    Type: AWS::Serverless::Function
    Properties:
        FunctionName: '***'
        Events:
            ApiEventLambda:
                Type: Api
                Properties:
                    Path: /your-path
                    Method: post
                    RestApiId: !Ref YourProjectApi
                    Auth:
                        Authorizer: LambdaAuthorizerApiGateway

Calling

To call methods protected by the authorizer, first generate tokens from the client credentials registered under "Registering applications". To do so, send a POST request to https://auth.{ENVIRONMENT}.datalake-authorizer.datalake.systems/auth with this body:

{
    "app_id": "<app_id>",
    "app_secret": "<app_secret>"
}

Send the app_secret in its original form, not the SHA-256 value stored in the database. If the request succeeds, you will receive an access_token and a refresh_token (which are functionally identical in machine-to-machine flows).
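
The token request described above could look roughly like this with Python's standard library. This is a sketch under assumptions: the endpoint is taken to accept and return JSON, the Content-Type header is a guess, and the function names are hypothetical.

```python
import json
import urllib.request

AUTH_URL_TEMPLATE = (
    "https://auth.{environment}.datalake-authorizer.datalake.systems/auth"
)


def build_token_request(
    environment: str, app_id: str, app_secret: str
) -> urllib.request.Request:
    """Build the POST request; app_secret is the plain-text secret."""
    body = json.dumps({"app_id": app_id, "app_secret": app_secret})
    return urllib.request.Request(
        AUTH_URL_TEMPLATE.format(environment=environment),
        data=body.encode("utf-8"),
        # Assumption: the endpoint expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_tokens(environment: str, app_id: str, app_secret: str) -> dict:
    """Send the request and parse the response, assumed to be JSON
    containing access_token and refresh_token."""
    request = build_token_request(environment, app_id, app_secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_tokens("dev", "my-app", "my-plain-secret") would then be expected to return a dictionary holding both tokens, assuming an environment named dev exists.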

Finally, call the authenticated APIs by passing your access_token in the x-api-key header.
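
A minimal sketch of such a call, again using only the standard library. The function name and the example URL are hypothetical; only the x-api-key header comes from the description above.

```python
import urllib.request


def build_api_request(
    url: str, access_token: str, method: str = "GET"
) -> urllib.request.Request:
    """Build a request that passes the access token in x-api-key."""
    return urllib.request.Request(
        url,
        headers={"x-api-key": access_token},
        method=method,
    )


# Example with a made-up URL and token; send it with urllib.request.urlopen.
example_request = build_api_request(
    "https://example.com/your-path", "token-123", method="POST"
)
```

HTTP header names are case-insensitive, so it does not matter that urllib normalizes the capitalization of the header name when storing it.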