Neo4j Behind Traefik
Example service definition for running Neo4j behind Traefik, adapted from https://blog.y1zhou.com/neo4j-bolt-behind-traefik-in-docker.
The key fix is that Traefik needs to route three kinds of traffic: HTTP(S) to the browser endpoint, the browser's Bolt connection over a WebSocket, and driver Bolt connections as raw TCP with TLS passthrough.
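The labels below assume a Traefik entrypoint named support and a traefik-public network already exist on the Traefik side. A minimal sketch of that static configuration might look like the following (the :8443 entrypoint address is an assumption, chosen to match the advertised Bolt address below; adjust to your deployment):-

```yaml
# traefik.yml (static configuration) -- illustrative sketch only
entryPoints:
  support:
    address: ":8443"

providers:
  docker:
    network: traefik-public
    exposedByDefault: false
```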
neo4j:
  image: neo4j:latest
  environment:
    - NEO4J_dbms_connector_https_advertised__address=:7473
    - NEO4J_dbms_connector_http_advertised__address=:7474
    - NEO4J_dbms_connector_bolt_advertised__address=:8443
    - NEO4J_AUTH=neo4j/password
    - NEO4J_apoc_export_file_enabled=true
    - NEO4J_apoc_import_file_enabled=true
    - NEO4J_apoc_import_file_use__neo4j__config=true
  deploy:
    mode: replicated
    replicas: 1
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik-public"
      # Neo4j browser service
      - "traefik.http.routers.neo4j.rule=Host(`${TRAEFIK_DOMAIN:-neo.local}`) && PathPrefix(`/neo4j`)"
      - "traefik.http.routers.neo4j.tls=true"
      - "traefik.http.routers.neo4j.service=neo4j"
      - "traefik.http.routers.neo4j.entrypoints=support"
      - "traefik.http.routers.neo4j.middlewares=neo4j-auth,neo4j-prefix,security-headers-middleware"
      - "traefik.http.services.neo4j.loadbalancer.server.port=7474"
      - "traefik.http.middlewares.neo4j-auth.basicauth.users=${AUTH_NEO4J_CREDENTIALS_USER}"
      - "traefik.http.middlewares.neo4j-prefix.stripprefix.prefixes=/neo4j"
      # Neo4j bolt service for browser websocket
      - "traefik.http.routers.neo4j-bolt.entrypoints=support"
      - "traefik.http.routers.neo4j-bolt.tls=true"
      - "traefik.http.routers.neo4j-bolt.rule=Host(`${TRAEFIK_DOMAIN:-neo.local}`)"
      - "traefik.http.routers.neo4j-bolt.service=neo4j-bolt"
      - "traefik.http.services.neo4j-bolt.loadbalancer.server.port=7687"
      - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https,wss"
      - "traefik.http.routers.neo4j-bolt.middlewares=sslheader"
      # Bolt service for drivers
      - "traefik.tcp.routers.neo4j-bolt.rule=HostSNI(`${TRAEFIK_DOMAIN:-neo.local}`)"
      - "traefik.tcp.routers.neo4j-bolt.service=neo4j-bolt"
      - "traefik.tcp.routers.neo4j-bolt.entrypoints=support"
      - "traefik.tcp.routers.neo4j-bolt.tls.passthrough=true"
      - "traefik.tcp.services.neo4j-bolt.loadbalancer.server.port=7687"
  ports:
    - 7474:7474
    - 7687:7687
  volumes:
    - type: volume
      source: neo4j-data
      target: /data
  networks:
    - traefik-public
Bulk APOC JSON Load
For a faster bulk JSON load, avoid loading the JSON into a Cypher query directly (e.g. using WITH/UNWIND) and instead look at the apoc.load.json and apoc.periodic.iterate APOC procedures.
From the APOC User Guide:-
With apoc.periodic.iterate you provide 2 statements: the first, outer statement provides a stream of values to be processed; the second, inner statement processes one element at a time, or with iterateList:true the whole batch at a time.
With apoc.load.json it’s now very easy to load JSON data from any file or URL, to avoid directly inserting the JSON into a script.
First, run a Neo4j instance:-
docker run -e NEO4J_AUTH=none \
  -e NEO4J_dbms_security_procedures_unrestricted='apoc.*' \
  -e NEO4J_apoc_import_file_enabled=true \
  -e NEO4J_dbms_memory_pagecache_size=4G \
  -e NEO4J_dbms_memory_heap_maxSize=4G \
  --rm \
  --name img \
  --publish=7474:7474 \
  --publish=7687:7687 \
  -v ./proj/data:/data \
  -v ./proj/import:/var/lib/neo4j/import \
  -v ./proj/plugins:/plugins \
  -v ./proj/conf:/var/lib/neo4j/conf \
  neo4j
Combining both of these APOC calls, we can take an input (in this case a list of named-entity recognition results):-
[
{
"confidence": 99.0,
"entities": [
{
"entity": ["NEO4J"],
"tag": "I-ORG"
}
],
"id": "11819",
"locale": "en",
"read_bytes": 1214
},
{
"confidence": 99.0,
"entities": [
{
"entity": ["ATLASSIAN"],
"tag": "I-ORG"
},
{
"entity": ["APPLE"],
"tag": "I-ORG"
}
],
"id": "11820",
"locale": "en",
"read_bytes": 1186
}
]
And load it like so:-
CALL apoc.periodic.iterate("
CALL apoc.load.json('file:///ner.json')
YIELD value AS parsedResponse RETURN parsedResponse
", "
MATCH (c:Crime) WHERE c.source_key = parsedResponse.id
FOREACH(entity IN parsedResponse.entities |
// Person
FOREACH(_ IN CASE WHEN entity.tag = 'I-PER' THEN [1] ELSE [] END |
MERGE (p:NER_Person {name: trim(reduce(s = \"\", x IN entity.entity | s + x + \" \"))})
MERGE (p)<-[:CONTAINS_ENTITY]-(c)
)
// Organization
FOREACH(_ IN CASE WHEN entity.tag = 'I-ORG' THEN [1] ELSE [] END |
MERGE (o:NER_Organization {name: trim(reduce(s = \"\", x IN entity.entity | s + x + \" \"))})
MERGE (o)<-[:CONTAINS_ENTITY]-(c)
)
// Location
FOREACH(_ IN CASE WHEN entity.tag = 'I-LOC' THEN [1] ELSE [] END |
MERGE (l:NER_Location {name: trim(reduce(s = \"\", x IN entity.entity | s + x + \" \"))})
MERGE (l)<-[:CONTAINS_ENTITY]-(c)
)
)",
{
batchSize: 10000,
iterateList: true
}
);
This completes much more quickly than the direct approaches, loading the data gradually in batches, which is especially noticeable on large datasets.
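As an aside, the reduce(s = "", x IN entity.entity | s + x + " ") plus trim(...) pattern in the inner statement simply joins the entity tokens with single spaces. A quick Python sketch of the same logic:-

```python
from functools import reduce

def entity_name(tokens):
    """Mirror the Cypher reduce/trim: concatenate each token with a
    trailing space, then strip the final one."""
    return reduce(lambda s, x: s + x + " ", tokens, "").strip()

print(entity_name(["NEO4J"]))        # NEO4J
print(entity_name(["New", "York"]))  # New York
```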
Lexi - Greek Word of the Day
iOS app for showing a random Greek Word of the Day, built with React Native and Expo.
Face Recognition with DeepDetect
Introduction
DeepDetect (DD) is an open-source deep-learning server and API designed to help bridge the gap toward machine learning as a commodity. It originates from a series of applications built for a handful of large corporations and small startups. It has support for Caffe, one of the most appreciated libraries for deep learning, and it easily connects to a range of sources and sinks.
This enables deep learning to fit into existing stacks and applications with reduced effort.
This short tutorial shows how to set up your own facial recognition server with DD: you provide the service an image, and it returns the nearest matches with their probabilities.
Set up DD
First of all, pull & run a DD container using Docker:-
docker run -p 8080:8080 --name dd beniz/deepdetect_cpu
This should take a few minutes to download and run.
Download Classification Model
Download the Visual Geometry Group’s face descriptors (direct link) to allow DD to classify face images.
The VGG descriptors (i.e. descriptions of the visual features of the contents of images, videos, etc.) are evaluated on the Labeled Faces in the Wild dataset, a de facto standard for measuring face recognition performance.
Install Classification Model
Extract the face descriptors to a directory:-
total 564M
-rw-r--r-- 1 gavin 1.1K Oct 13 2015 COPYING
-rw-r--r-- 1 gavin 1.4K Oct 13 2015 README
-rw-r--r-- 1 gavin 554M Oct 13 2015 VGG_FACE.caffemodel
-rw-r--r-- 1 gavin 4.8K Nov 1 18:42 VGG_FACE_deploy.prototxt
-rw-r--r-- 1 gavin 59K Oct 13 2015 ak.png
-rw-r--r-- 1 gavin 630 Oct 13 2015 matcaffe_demo.m
-rw-r--r-- 1 gavin 37K Oct 13 2015 names.txt
Replace deploy.prototxt file
According to this issue, the VGG_FACE_deploy.prototxt file (i.e. the definition of input blobs) is based on an older version of Caffe and has to be updated for DD. Download deploy.prototxt and overwrite VGG_FACE_deploy.prototxt within your extracted directory.
Create corresp.txt file
DD requires a correspondence file to turn vgg_face categories, such as ‘1014’, into textual categories, such as ‘Tommy Flanagan’. When training a model with DD, this file is generated automatically. However, since we are using a pre-trained model from outside DD, the file has to be explicitly added to the repository.
Download corresp.txt to the vgg_face_caffe directory as above.
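For reference, a DD correspondence file is plain text with one class per line, mapping the numeric category to its label. An illustrative fragment (the mapping shown here reuses the ‘1014’ → ‘Tommy Flanagan’ example above; the real file ships with the download):-

```
1014 Tommy Flanagan
```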
The directory should now look as follows:-
total 564M
-rw-r--r-- 1 gavin 1.1K Oct 13 2015 COPYING
-rw-r--r-- 1 gavin 1.4K Oct 13 2015 README
-rw-r--r-- 1 gavin 554M Oct 13 2015 VGG_FACE.caffemodel
-rw-r--r-- 1 gavin 4.8K Nov 1 18:42 VGG_FACE_deploy.prototxt
-rw-r--r-- 1 gavin 59K Oct 13 2015 ak.png
-rw-r--r-- 1 gavin 48K Nov 4 11:55 corresp.txt
-rw-r--r-- 1 gavin 630 Oct 13 2015 matcaffe_demo.m
-rw-r--r-- 1 gavin 37K Oct 13 2015 names.txt
Copy to container
Now we’re ready to copy this directory to the /opt/models/ folder of the running DD container, like so:-
docker cp vgg_face_caffe dd:/opt/models
docker exec -it --user root dd chown dd:dd /opt/models/vgg_face_caffe
Optionally, verify that the directory exists within the container:-
docker exec -it dd /bin/sh -c "cd /opt/models; ls -al; bash"
Should output:-
total 20
drwxr-xr-x 11 dd dd 4096 Nov 4 09:47 .
drwxr-xr-x 18 root root 4096 Nov 4 09:47 ..
drwxr-xr-x 2 dd dd 4096 Sep 18 07:20 ggnet
drwxr-xr-x 2 dd dd 4096 Sep 18 07:20 resnet_50
drwxr-xr-x 2 dd dd 4096 Nov 1 18:42 vgg_face_caffe
dd@30109d227487:/opt/models$
Create Classification Service
The next step is to create the new classification service for DD by issuing the following request to the running DD instance:-
curl -X PUT "http://<docker machine ip>:8080/services/face" -d '
{
"mllib":"caffe",
"description":"face recognition service",
"type":"supervised",
"parameters":{
"input":{
"connector":"image",
"width":224,
"height":224
},
"mllib":{
"nclasses":1000
}
},
"model":{
"templates":"../templates/caffe/",
"repository":"/opt/models/vgg_face_caffe"
}
}'
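If you prefer scripting the service creation over raw curl, the same PUT can be issued from Python’s standard library. A sketch under the same assumptions as the curl call above (host, port, and service name are yours to adjust):-

```python
import json
import urllib.request

def service_payload():
    # Same request body as the curl example above.
    return {
        "mllib": "caffe",
        "description": "face recognition service",
        "type": "supervised",
        "parameters": {
            "input": {"connector": "image", "width": 224, "height": 224},
            "mllib": {"nclasses": 1000},
        },
        "model": {
            "templates": "../templates/caffe/",
            "repository": "/opt/models/vgg_face_caffe",
        },
    }

def create_service(host="localhost", port=8080, name="face"):
    # PUT /services/<name> on the running DD instance.
    req = urllib.request.Request(
        f"http://{host}:{port}/services/{name}",
        data=json.dumps(service_payload()).encode(),
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# create_service()  # requires a running DD container
```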
This should then output:-
{"status":{"code":201,"msg":"Created"}}
Providing Inputs
We’re now ready to provide an input to our service. Let’s try and identify an image of Jeff Goldblum:-
Pass the URL of this image into the predict service of DD (making sure to specify the correct service name):-
curl -X POST "http://<docker machine ip>:8080/predict" -d '{
"service":"face",
"parameters":{
"input":{
"width":224,
"height":224
},
"output":{
"best":3
}
},
"data":["https://pixel.nymag.com/imgs/daily/vulture/2016/04/15/15-jeff-goldblum.w710.h473.2x.jpg"]
}' | jq '.'
Piped into jq for a nicer response:-
{
"status": {
"code": 200,
"msg": "OK"
},
"head": {
"method": "/predict",
"service": "face",
"time": 7901
},
"body": {
"predictions": [
{
"uri": "https://pixel.nymag.com/imgs/daily/vulture/2016/04/15/15-jeff-goldblum.w710.h473.2x.jpg",
"classes": [
{
"prob": 0.7104721665382385,
"cat": "Jeff Goldblum"
},
{
"prob": 0.09826808422803879,
"cat": "Tommy Flanagan"
},
{
"prob": 0.08003903180360794,
"last": true,
"cat": "Anna Gunn"
}
]
}
]
}
}
As you can see from the output above, there is roughly a 71% probability that the input image is indeed Jeff Goldblum.
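Programmatically, picking the best match out of a predict response is just a matter of taking the highest-probability entry in the classes array. A small sketch against the response shape shown above:-

```python
def best_match(response):
    """Return (category, probability) of the highest-probability class
    for the first prediction in a DeepDetect /predict response."""
    classes = response["body"]["predictions"][0]["classes"]
    top = max(classes, key=lambda c: c["prob"])
    return top["cat"], top["prob"]

# Trimmed-down copy of the response body above.
sample = {
    "body": {
        "predictions": [{
            "classes": [
                {"prob": 0.7104721665382385, "cat": "Jeff Goldblum"},
                {"prob": 0.09826808422803879, "cat": "Tommy Flanagan"},
                {"prob": 0.08003903180360794, "cat": "Anna Gunn"},
            ]
        }]
    }
}

cat, prob = best_match(sample)
print(f"{cat}: {prob:.0%}")  # Jeff Goldblum: 71%
```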