Kwalify Users' Guide (for Ruby and Java)
last update: $Date: 2005-12-20 12:50:56 +0900 (Tue, 20 Dec 2005) $
Preface
Kwalify(*1) is a tiny schema validator for YAML and JSON documents.
You know the "80-20 rule", also known as the Pareto Law, don't you? This rule suggests that 20% of the population owns 80% of the wealth. Kwalify is based on a new "50-5 rule", which suggests that 5% of the population owns 50% of the wealth. This rule is more aggressive and cost-effective than the Pareto Law. The rule is named "Levi's Law".
| schema technology | (A) cover range | (B) cost to pay | (A)/(B) effectiveness |
|---|---|---|---|
| XML Schema | 95% | 100% | 0.95 (= 95/100) |
| RelaxNG | 80% | 20% | 4.0 (= 80/20) |
| Kwalify | 50% | 5% | 10.0 (= 50/5) |
Kwalify is small and in fact less capable than RelaxNG or XML Schema. I hope you will extend and customize Kwalify to suit your own needs.
(*1) Pronounced as 'Qualify'.
Usage of Kwalify
Usage in Command-Line
### kwalify-ruby
$ kwalify -f schema.yaml document.yaml [document2.yaml ...]
### kwalify-java
$ java -classpath kwalify.jar kwalify.Main -f schema.yaml document.yaml [document2.yaml ...]

### kwalify-ruby
$ kwalify -m schema.yaml [schema2.yaml ...]
### kwalify-java
$ java -classpath kwalify.jar kwalify.Main -m schema.yaml [schema2.yaml ...]
Command-line options:
- -h, --help : Print help message.
- -v : Print version.
- -s : Silent mode.
- -f schema.yaml : Specify schema definition file.
- -m : Meta-validation of schema definition.
- -t : Expand tab characters to spaces automatically.
- -l : Show the line number on which each error is found.
- -E : Show errors in Emacs-compatible style (implies '-l' option).
Notice that the command-line option -l is an experimental feature, because the kwalify command uses its own YAML parser instead of the Syck parser when this option is specified.
If you are an Emacs user, try the -E option, which shows errors in a format that Emacs can parse and jump to.
You can use C-x ` (next-error) to jump to each error.
Usage in Ruby Script
The following are example scripts for Ruby.
require 'kwalify'
require 'yaml'

## parse schema definition and create validator
schema = YAML.load_file('schema.yaml')
validator = Kwalify::Validator.new(schema)   # raises Kwalify::SchemaError if wrong

## validate YAML document
document = YAML.load_file('document.yaml')
error_list = validator.validate(document)
unless error_list.empty?
  error_list.each do |error|   # error is instance of Kwalify::ValidationError
    puts "[#{error.path}] #{error.message}"
  end
end

require 'kwalify'
require 'yaml'

## parse schema definition and create validator
schema = YAML.load_file('schema.yaml')
validator = Kwalify::Validator.new(schema)   # raises Kwalify::SchemaError if wrong

## parse YAML document with Kwalify's parser
str = File.read('document.yaml')
parser = Kwalify::Parser.new(str)
document = parser.parse()

## validate document and show errors
error_list = validator.validate(document)
unless error_list.empty?
  parser.set_errors_linenum(error_list)   # set linenum on error
  error_list.sort.each do |error|
    puts "(line %d)[%s] %s" % [error.linenum, error.path, error.message]
  end
end
Kwalify's YAML parser is experimental. Note that it supports only the basic syntax of YAML.
The following is an example program for Java.
import kwalify.*;
import java.util.*;   // List, Iterator, Collections

public class Test {
    public static void main(String[] args) throws Exception {
        // read schema
        String schema_str = Util.readFile("schema.yaml");
        Object schema = new YamlParser(schema_str).parse();

        // read document file
        String document_str = Util.readFile("document.yaml");
        YamlParser parser = new YamlParser(document_str);
        Object document = parser.parse();

        // create validator and validate
        Validator validator = new Validator(schema);
        List errors = validator.validate(document);

        // show errors
        if (errors != null && errors.size() > 0) {
            parser.setErrorsLineNumber(errors);
            Collections.sort(errors);
            for (Iterator it = errors.iterator(); it.hasNext(); ) {
                ValidationException error = (ValidationException)it.next();
                int linenum = error.getLineNumber();
                String path = error.getPath();
                String mesg = error.getMessage();
                System.out.println("- " + linenum + ": [" + path + "] " + mesg);
            }
        }
    }
}
Schema Definition
Sequence
schema01.yaml : sequence of string
type: seq
sequence:
  - type: str
document01a.yaml : valid document example
- foo
- bar
- baz
$ kwalify -lf schema01.yaml document01a.yaml
document01a.yaml#0: valid.
document01b.yaml : invalid document example
- foo
- 123
- baz
$ kwalify -lf schema01.yaml document01b.yaml
document01b.yaml#0: INVALID
  - (line 2) [/1] '123': not a string.
The default 'type:' is str, so you can omit 'type: str'.
Mapping
schema02.yaml : mapping of scalar
type: map
mapping:
  name:
    type: str
    required: yes
  email:
    type: str
    pattern: /@/
  age:
    type: int
  birth:
    type: date
document02a.yaml : valid document example
name: foo
email: foo@mail.com
age: 20
birth: 1985-01-01
$ kwalify -lf schema02.yaml document02a.yaml
document02a.yaml#0: valid.
document02b.yaml : invalid document example
name: foo
email: foo(at)mail.com
age: twenty
birth: Jun 01, 1985
$ kwalify -lf schema02.yaml document02b.yaml
document02b.yaml#0: INVALID
  - (line 2) [/email] 'foo(at)mail.com': not matched to pattern /@/.
  - (line 3) [/age] 'twenty': not a integer.
  - (line 4) [/birth] 'Jun 01, 1985': not a date.
Sequence of Mapping
schema03.yaml : sequence of mapping
type: seq
sequence:
  - type: map
    mapping:
      name:
        type: str
        required: true
      email:
        type: str
document03a.yaml : valid document example
- name: foo
  email: foo@mail.com
- name: bar
  email: bar@mail.net
- name: baz
  email: baz@mail.org
$ kwalify -lf schema03.yaml document03a.yaml
document03a.yaml#0: valid.
document03b.yaml : invalid document example
- name: foo
  email: foo@mail.com
- naem: bar
  email: bar@mail.net
- name: baz
  mail: baz@mail.org
$ kwalify -lf schema03.yaml document03b.yaml
document03b.yaml#0: INVALID
  - (line 3) [/1] key 'name:' is required.
  - (line 3) [/1/naem] key 'naem:' is undefined.
  - (line 6) [/2/mail] key 'mail:' is undefined.
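The two kinds of error in the output above (a required key that is missing, and a key that the schema does not define) can be illustrated with a small plain-Ruby sketch. This is not how Kwalify is implemented; the helper name check_keys and the [path, message] pair format are assumptions made only for this illustration, and the sketch does not need Kwalify to run.

```ruby
# Minimal sketch (not Kwalify's implementation): check each mapping of a
# sequence for missing required keys and for keys the rules do not define.
rules = {
  'name'  => { 'type' => 'str', 'required' => true },
  'email' => { 'type' => 'str' }
}

# Hypothetical helper: returns [path, message] pairs for one mapping.
def check_keys(mapping, rules, path)
  errors = []
  rules.each do |key, rule|
    if rule['required'] && !mapping.key?(key)
      errors << [path, "key '#{key}:' is required."]
    end
  end
  mapping.each_key do |key|
    unless rules.key?(key)
      errors << ["#{path}/#{key}", "key '#{key}:' is undefined."]
    end
  end
  errors
end

# Same data as document03b.yaml, written as Ruby literals.
document = [
  { 'name' => 'foo', 'email' => 'foo@mail.com' },
  { 'naem' => 'bar', 'email' => 'bar@mail.net' },
  { 'name' => 'baz', 'mail'  => 'baz@mail.org' }
]

errors = document.each_with_index.flat_map do |item, i|
  check_keys(item, rules, "/#{i}")
end
errors.each { |path, msg| puts "[#{path}] #{msg}" }
```

The paths follow the same convention as the kwalify output: a sequence index first, then the key name.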
Mapping of Sequence
schema04.yaml : mapping of sequence of mapping
type: map
mapping:
  company:
    type: str
    required: yes
  email:
    type: str
  employees:
    type: seq
    sequence:
      - type: map
        mapping:
          code:
            type: int
            required: yes
          name:
            type: str
            required: yes
          email:
            type: str
document04a.yaml : valid document example
company: Kuwata lab.
email: webmaster@kuwata-lab.com
employees:
  - code: 101
    name: foo
    email: foo@kuwata-lab.com
  - code: 102
    name: bar
    email: bar@kuwata-lab.com
$ kwalify -lf schema04.yaml document04a.yaml
document04a.yaml#0: valid.
document04b.yaml : invalid document example
company: Kuwata Lab.
email: webmaster@kuwata-lab.com
employees:
  - code: A101
    name: foo
    email: foo@kuwata-lab.com
  - code: 102
    name: bar
    mail: bar@kuwata-lab.com
$ kwalify -lf schema04.yaml document04b.yaml
document04b.yaml#0: INVALID
  - (line 4) [/employees/0/code] 'A101': not a integer.
  - (line 9) [/employees/1/mail] key 'mail:' is undefined.
Rule and Entry
A rule is a set of entries. An entry usually represents a constraint, with a few exceptions.
The following are constraint entries.
- required: Value is required when true (default is false).
- enum: List of available values.
- pattern: Specifies a regular expression pattern for the value.
- type: Type of value. The following types are available:
  - str
  - int
  - float
  - number (== int or float)
  - text (== str or number)
  - bool
  - date
  - time
  - timestamp
  - seq
  - map
  - scalar (all but seq and map)
  - any (means any data)
- range: Range of value between max/max-ex and min/min-ex.
  - 'max' means 'max-inclusive'.
  - 'min' means 'min-inclusive'.
  - 'max-ex' means 'max-exclusive'.
  - 'min-ex' means 'min-exclusive'.
  The types seq, map, bool and any are not available with range:.
- length: Range of length of value between max/max-ex and min/min-ex. Only the types str and text are available with length:.
- assert: String which represents a validation expression. The string should contain the variable name val, which represents the value. (This is an experimental feature and is supported only in kwalify-ruby.)
- unique: Value is unique within a mapping or sequence. See the next subsection for details.
The following are non-constraint entries.
- name: Name of schema.
- desc: Description. This is not used for validation.
A rule contains a 'type:' entry. The 'sequence:' entry takes a list of rules. The 'mapping:' entry takes a hash whose values are rules.
schema05.yaml : rule examples
type: seq                                # new rule
sequence:
  -
    type: map                            # new rule
    mapping:
      name:
        type: str                        # new rule
        required: yes
      email:
        type: str                        # new rule
        required: yes
        pattern: /@/
      password:
        type: text                       # new rule
        length: { max: 16, min: 8 }
      age:
        type: int                        # new rule
        range: { max: 30, min: 18 }
        # or assert: 18 <= val && val <= 30
      blood:
        type: str                        # new rule
        enum:
          - A
          - B
          - O
          - AB
      birth:
        type: date                       # new rule
      memo:
        type: any                        # new rule
document05a.yaml : valid document example
- name: foo
  email: foo@mail.com
  password: xxx123456
  age: 20
  blood: A
  birth: 1985-01-01
- name: bar
  email: bar@mail.net
  age: 25
  blood: AB
  birth: 1980-01-01
$ kwalify -lf schema05.yaml document05a.yaml
document05a.yaml#0: valid.
document05b.yaml : invalid document example
- name: foo
  email: foo(at)mail.com
  password: xxx123
  age: twenty
  blood: a
  birth: 1985-01-01
- given-name: bar
  family-name: Bar
  email: bar@mail.net
  age: 15
  blood: AB
  birth: 1980/01/01
$ kwalify -lf schema05.yaml document05b.yaml
document05b.yaml#0: INVALID
  - (line 2) [/0/email] 'foo(at)mail.com': not matched to pattern /@/.
  - (line 3) [/0/password] 'xxx123': too short (length 6 < min 8).
  - (line 4) [/0/age] 'twenty': not a integer.
  - (line 5) [/0/blood] 'a': invalid blood value.
  - (line 7) [/1/given-name] key 'given-name:' is undefined.
  - (line 7) [/1] key 'name:' is required.
  - (line 8) [/1/family-name] key 'family-name:' is undefined.
  - (line 10) [/1/age] '15': too small (< min 18).
  - (line 12) [/1/birth] '1980/01/01': not a date.
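The inclusive and exclusive bounds of 'range:' and 'length:' can be sketched in plain Ruby as follows. This only illustrates the max/min and max-ex/min-ex semantics described in this section; it is not Kwalify's code, and the helper name bound_violation and the symbols it returns are hypothetical.

```ruby
# Sketch of inclusive/exclusive bound checking (hypothetical helper).
# Returns :too_large, :too_small, or nil when the value is within bounds.
def bound_violation(value, bounds)
  return :too_large if bounds.key?('max')    && value >  bounds['max']     # max is inclusive
  return :too_large if bounds.key?('max-ex') && value >= bounds['max-ex']  # max-ex is exclusive
  return :too_small if bounds.key?('min')    && value <  bounds['min']     # min is inclusive
  return :too_small if bounds.key?('min-ex') && value <= bounds['min-ex']  # min-ex is exclusive
  nil
end

age_range       = { 'max' => 30, 'min' => 18 }   # range: of 'age' in schema05.yaml
password_length = { 'max' => 16, 'min' => 8 }    # length: of 'password' in schema05.yaml

p bound_violation(20, age_range)                       # within bounds
p bound_violation(15, age_range)                       # below min
p bound_violation(30, age_range)                       # equal to max is allowed
p bound_violation(30, { 'max-ex' => 30 })              # equal to max-ex is not allowed
p bound_violation('xxx123'.length, password_length)    # length: checks the string length
```

For 'length:' the same comparison is applied to the length of the value rather than to the value itself.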
Unique constraint
The 'unique:' constraint entry is available for elements of a sequence or mapping.
It is equivalent to a unique key or primary key in an RDBMS.
The type of a rule which has a 'unique:' entry must be scalar (str, int, float, ...).
The type of the parent rule must be sequence or mapping.
schema06.yaml : unique constraint entry with mapping and sequence
type: seq
sequence:
  - type: map
    required: yes
    mapping:
      name:
        type: str
        required: yes
        unique: yes
      email:
        type: str
      groups:
        type: seq
        sequence:
          - type: str
            unique: yes
document06a.yaml : valid document example
- name: foo
  email: admin@mail.com
  groups:
    - users
    - foo
    - admin
- name: bar
  email: admin@mail.com
  groups:
    - users
    - admin
- name: baz
  email: baz@mail.com
  groups:
    - users
$ kwalify -lf schema06.yaml document06a.yaml
document06a.yaml#0: valid.
document06b.yaml : invalid document example
- name: foo
  email: admin@mail.com
  groups:
    - foo
    - users
    - admin
    - foo
- name: bar
  email: admin@mail.com
  groups:
    - admin
    - users
- name: bar
  email: baz@mail.com
  groups:
    - users
$ kwalify -lf schema06.yaml document06b.yaml
document06b.yaml#0: INVALID
  - (line 7) [/0/groups/3] 'foo': is already used at '/0/groups/0'.
  - (line 13) [/2/name] 'bar': is already used at '/1/name'.
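One way to picture the 'unique:' check is to remember the path at which each value was first seen and to report any later occurrence. The following plain-Ruby sketch does that for the data of document06b.yaml; it is an illustration under that assumption, not Kwalify's actual algorithm, and the helper name unique_errors is hypothetical.

```ruby
# Sketch of a uniqueness check (hypothetical helper, not Kwalify's code).
# entries is an array of [path, value] pairs that share one 'unique:' rule.
def unique_errors(entries)
  first_seen = {}   # value => path of its first occurrence
  errors = []
  entries.each do |path, value|
    if first_seen.key?(value)
      errors << [path, "'#{value}': is already used at '#{first_seen[value]}'."]
    else
      first_seen[value] = path
    end
  end
  errors
end

# 'groups' of the first element of document06b.yaml (unique within the sequence)
groups = %w[foo users admin foo].each_with_index.map { |v, i| ["/0/groups/#{i}", v] }
# 'name' of every element of document06b.yaml (unique across the mappings)
names = %w[foo bar bar].each_with_index.map { |v, i| ["/#{i}/name", v] }

group_errors = unique_errors(groups)
name_errors  = unique_errors(names)
(group_errors + name_errors).each { |path, msg| puts "[#{path}] #{msg}" }
```

Note that 'email' has no 'unique:' entry in schema06.yaml, so the repeated admin@mail.com address is not an error.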
Validator#validate_hook()
You can extend the Kwalify::Validator class (Ruby) or the kwalify.Validator class (Java) and override the Kwalify::Validator#validate_hook() method (Ruby) or the kwalify.Validator#validateHook() method (Java). This method is called by Kwalify::Validator#validate() (Ruby) or kwalify.Validator#validate() (Java).
answers-schema.yaml : schema definition
type: map
mapping:
  answers:
    type: seq
    sequence:
      - type: map
        name: Answer
        mapping:
          name:
            type: str
            required: yes
          answer:
            type: str
            required: yes
            enum:
              - good
              - not bad
              - bad
          reason:
            type: str
answers-validator.rb : validate script for Ruby
#!/usr/bin/env ruby

require 'kwalify'
require 'yaml'

## validator class for answers
class AnswersValidator < Kwalify::Validator

  ## load schema definition
  @@schema = YAML.load_file('answers-schema.yaml')

  def initialize()
    super(@@schema)
  end

  ## hook method called by Validator#validate()
  def validate_hook(value, rule, path, errors)
    case rule.name
    when 'Answer'
      if value['answer'] == 'bad'
        reason = value['reason']
        if !reason || reason.empty?
          msg = "reason is required when answer is 'bad'."
          errors << Kwalify::ValidationError.new(msg, path)
        end
      end
    end
  end

end

## create validator
validator = AnswersValidator.new

## load YAML document
input = ARGF.read()
document = YAML.load(input)

## validate
errors = validator.validate(document)
if errors.empty?
  puts "Valid."
else
  puts "*** INVALID!"
  errors.each do |error|
    # error.class == Kwalify::ValidationError
    puts " - [#{error.path}] : #{error.message}"
  end
end
document07a.yaml : valid document example
answers:
  - name: Foo
    answer: good
    reason: I like this style.
  - name: Bar
    answer: not bad
  - name: Baz
    answer: bad
    reason: I don't like this style.
$ ruby answers-validator.rb document07a.yaml
Valid.
document07b.yaml : invalid document example
answers:
  - name: Foo
    answer: good
  - name: Bar
    answer: bad
  - name: Baz
    answer: not bad
$ ruby answers-validator.rb document07b.yaml
*** INVALID!
 - [/answers/1] : reason is required when answer is 'bad'.
You can validate several documents with a single Validator instance, because the Validator class and the Validator#validate() method are stateless. If you use instance variables in a custom validate_hook() method, the validator becomes stateful.
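To make the point about statelessness concrete, here is a sketch in which the body of the hook is written as a lambda that keeps no state of its own, so the same lambda can check both example documents one after the other. It runs without Kwalify: the Struct below is only a hypothetical stand-in for Kwalify::ValidationError, and the driver lambda check is likewise an assumption for illustration, not part of the library.

```ruby
# Hypothetical stand-in for Kwalify::ValidationError (only message and path).
error_class = Struct.new(:message, :path)

# Hook body without any state of its own: everything it needs is passed in.
answer_hook = lambda do |value, rule_name, path, errors|
  return unless rule_name == 'Answer' && value['answer'] == 'bad'
  reason = value['reason']
  if reason.nil? || reason.empty?
    errors << error_class.new("reason is required when answer is 'bad'.", path)
  end
end

# Hypothetical driver: applies the hook to every element of 'answers'.
check = lambda do |document|
  errors = []
  document['answers'].each_with_index do |answer, i|
    answer_hook.call(answer, 'Answer', "/answers/#{i}", errors)
  end
  errors
end

# Same data as document07a.yaml and document07b.yaml.
doc_a = { 'answers' => [
  { 'name' => 'Foo', 'answer' => 'good', 'reason' => 'I like this style.' },
  { 'name' => 'Bar', 'answer' => 'not bad' },
  { 'name' => 'Baz', 'answer' => 'bad', 'reason' => "I don't like this style." }
] }
doc_b = { 'answers' => [
  { 'name' => 'Foo', 'answer' => 'good' },
  { 'name' => 'Bar', 'answer' => 'bad' },
  { 'name' => 'Baz', 'answer' => 'not bad' }
] }

errors_a = check.call(doc_a)
errors_b = check.call(doc_b)
puts errors_a.empty? ? 'Valid.' : '*** INVALID!'
errors_b.each { |e| puts " - [#{e.path}] : #{e.message}" }
```

Because each call receives its own errors list, the result for the second document does not depend on the first.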
Here is a Java program equivalent to 'answers-validator.rb'.
import kwalify.Validator;
import kwalify.Rule;
import kwalify.Util;
import kwalify.YamlUtil;
import kwalify.YamlParser;
import kwalify.SyntaxException;
import kwalify.ValidationException;

import java.util.*;
import java.io.IOException;

/**
 *  validator class for answers
 */
public class AnswersValidator extends Validator {

    /** schema string */
    private static final String SCHEMA = ""
        + "type:      map\n"
        + "mapping:\n"
        + "  answers:\n"
        + "    type:      seq\n"
        + "    sequence:\n"
        + "      - type:      map\n"
        + "        name:      Answer\n"
        + "        mapping:\n"
        + "          name:\n"
        + "            type:      str\n"
        + "            required:  yes\n"
        + "          answer:\n"
        + "            type:      str\n"
        + "            required:  yes\n"
        + "            enum:\n"
        + "              - good\n"
        + "              - not bad\n"
        + "              - bad\n"
        + "          reason:\n"
        + "            type:      str\n"
        ;

    /** schema object */
    private static Map schema = null;
    static {
        try {
            schema = (Map)YamlUtil.load(SCHEMA);
        } catch (SyntaxException ex) {
            assert false;
        }
    }

    /** constructor */
    public AnswersValidator() {
        super(schema);
    }

    /** hook method called by Validator#validate() */
    protected void validateHook(Object value, Rule rule, String path, List errors) {
        String rule_name = rule.getName();
        if (rule_name != null && rule_name.equals("Answer")) {
            assert value instanceof Map;
            Map val = (Map)value;
            assert val.get("answer") != null;
            if (val.get("answer").equals("bad")) {
                String reason = (String)val.get("reason");
                if (reason == null || reason.length() == 0) {
                    String msg = "reason is required when answer is 'bad'.";
                    errors.add(new ValidationException(msg, path));
                }
            }
        }
    }

    /** main program */
    public static void main(String[] args) throws IOException, SyntaxException {
        // create validator
        Validator validator = new AnswersValidator();

        // load YAML document
        String input;
        if (args.length > 0) {
            input = Util.readFile(args[0]);
        } else {
            input = Util.readInputStream(System.in);
        }
        YamlParser parser = new YamlParser(input);
        Object document = parser.parse();

        // validate and show errors
        List errors = validator.validate(document);
        if (errors == null || errors.size() == 0) {
            System.out.println("Valid.");
        } else {
            System.out.println("*** INVALID!");
            parser.setErrorsLineNumber(errors);
            Collections.sort(errors);
            for (Iterator it = errors.iterator(); it.hasNext(); ) {
                ValidationException error = (ValidationException)it.next();
                int linenum = error.getLineNumber();
                String path = error.getPath();
                String mesg = error.getMessage();
                String s = "- line " + linenum + ": [" + path + "] " + mesg;
                System.out.println(s);
            }
        }
    }

}
$ java -classpath kwalify.jar AnswersValidator document07a.yaml
Valid.
$ java -classpath kwalify.jar AnswersValidator document07b.yaml
*** INVALID!
- line 4: [/answers/1] reason is required when answer is 'bad'.
Validator with Block
Notice: This is an experimental feature.
The Kwalify::Validator.new() method can take a block, which is invoked during validation.
validate08.rb : validate script
#!/usr/bin/env ruby

require 'kwalify'
require 'yaml'

## load schema definition
schema = YAML.load_file('answers-schema.yaml')

## create validator for answers
validator = Kwalify::Validator.new(schema) { |value, rule, path, errors|
  case rule.name
  when 'Answer'
    if value['answer'] == 'bad'
      reason = value['reason']
      if !reason || reason.empty?
        msg = "reason is required when answer is 'bad'."
        errors << Kwalify::ValidationError.new(msg, path)
      end
    end
  end
}

## load YAML document
input = ARGF.read()
document = YAML.load(input)

## validate
errors = validator.validate(document)
if errors.empty?
  puts "Valid."
else
  puts "*** INVALID!"
  errors.each do |error|
    # error.class == Kwalify::ValidationError
    puts " - [#{error.path}] : #{error.message}"
  end
end
$ ruby validate08.rb document07a.yaml
Valid.
$ ruby validate08.rb document07b.yaml
*** INVALID!
 - [/answers/1] : reason is required when answer is 'bad'.
Tips
Enclose Key Names in (Double) Quotes
YAML allows key names to be enclosed in single quotes (') or double quotes ("). Quoting them makes user-defined key names stand out from the schema's own entry names.
schema11a.yaml : enclosing in double-quotes
type: map
mapping:
  "name":
    required: yes
  "email":
    pattern: /@/
  "age":
    type: int
  "birth":
    type: date
You may prefer to indent with 1 space and 3 spaces.
schema11b.yaml : indent with 1 space and 3 spaces
type: map
mapping:
 "name":
    required: yes
 "email":
    pattern: /@/
 "age":
    type: int
 "birth":
    type: date
JSON
JSON is a lightweight data-interchange format, especially useful for JavaScript. JSON can be considered a subset of YAML, which means that a YAML parser can parse JSON and Kwalify can validate JSON documents.
schema12.yaml : an example schema written in JSON format
{ "type": "map",
  "required": true,
  "mapping": {
    "name": {
      "type": "str",
      "required": true
    },
    "email": {
      "type": "str"
    },
    "age": {
      "type": "int"
    },
    "gender": {
      "type": "str",
      "enum": ["M", "F"]
    },
    "favorite": {
      "type": "seq",
      "sequence": [
        { "type": "str" }
      ]
    }
  }
}
document12a.yaml : valid JSON document example
{ "name": "Foo",
  "email": "foo@mail.com",
  "age": 20,
  "gender": "F",
  "favorite": [
    "football",
    "basketball",
    "baseball"
  ]
}
$ kwalify -lf schema12.yaml document12a.yaml
document12a.yaml#0: valid.
document12b.yaml : invalid JSON document example
{
  "mail": "foo@mail.com",
  "age": twenty,
  "gender": "X",
  "favorite": [ 123, 456 ]
}
$ kwalify -lf schema12.yaml document12b.yaml
document12b.yaml#0: INVALID
  - (line 1) [/] key 'name:' is required.
  - (line 2) [/mail] key 'mail:' is undefined.
  - (line 3) [/age] 'twenty': not a integer.
  - (line 4) [/gender] 'X': invalid gender value.
  - (line 5) [/favorite/0] '123': not a string.
  - (line 5) [/favorite/1] '456': not a string.
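The claim that a YAML parser can read JSON can be checked quickly with Ruby's standard libraries. The sketch below assumes a current Ruby, where the bundled yaml library is Psych rather than the Syck parser mentioned earlier in this guide, and compares its result for the text of document12a.yaml with that of the json library.

```ruby
require 'yaml'
require 'json'

# Same text as document12a.yaml.
json_text = <<~JSON
  { "name": "Foo",
    "email": "foo@mail.com",
    "age": 20,
    "gender": "F",
    "favorite": [
      "football",
      "basketball",
      "baseball"
    ]
  }
JSON

from_yaml = YAML.safe_load(json_text)   # parsed as YAML
from_json = JSON.parse(json_text)       # parsed as JSON

puts from_yaml == from_json             # both parsers should build the same Hash
p from_yaml['favorite']
```

Since the loaded document is an ordinary Hash in either case, it could be handed to a validator in the same way as a document loaded from YAML.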
Anchor
You can share schemas with YAML anchor.
schema13.yaml : anchor example
type: seq
sequence:
  - &employee
    type: map
    mapping:
      "given-name": &name
        type: str
        required: yes
      "family-name": *name
      "post":
        enum:
          - exective
          - manager
          - clerk
      "supervisor": *employee
Anchors are also available in YAML documents.
document13a.yaml : valid document example
- &foo
  given-name: foo
  family-name: Foo
  post: exective
- &bar
  given-name: bar
  family-name: Bar
  post: manager
  supervisor: *foo
- given-name: baz
  family-name: Baz
  post: clerk
  supervisor: *bar
- given-name: zak
  family-name: Zak
  post: clerk
  supervisor: *bar
$ kwalify -lf schema13.yaml document13a.yaml
document13a.yaml#0: valid.
Default of Mapping
YAML allows the user to specify a default value for a mapping.
For example, the following YAML document uses a default value for the mapping.
A: 10
B: 20
=: -1      # default value
This is equal to the following Ruby code.
map = {"A"=>10, "B"=>20}
map.default = -1
map
Kwalify allows the user to specify a default rule by using the default value of a mapping. It is useful when key names are unknown.
schema14.yaml : default rule example
type: map
mapping:
  =:              # default rule
    type: number
    range: { max: 1, min: -1 }
document14a.yaml : valid document example
value1: 0
value2: 0.5
value3: -0.9
$ kwalify -lf schema14.yaml document14a.yaml
document14a.yaml#0: valid.
document14b.yaml : invalid document example
value1: 0
value2: 1.1
value3: -2.0
$ kwalify -lf schema14.yaml document14b.yaml
document14b.yaml#0: INVALID
  - (line 2) [/value2] '1.1': too large (> max 1).
  - (line 3) [/value3] '-2.0': too small (< min -1).
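The idea of a default rule can be mimicked in plain Ruby with Hash#default: any key that has no rule of its own falls back to the rule stored under '='. The sketch below applies this to the data of document14b.yaml. It is an illustration only; how Kwalify looks up the default rule internally is not described in this guide, and the rule table here is written by hand rather than loaded from schema14.yaml.

```ruby
# Hand-written rule table modelled on schema14.yaml; '=' holds the default rule.
rules = { '=' => { 'type' => 'number', 'range' => { 'max' => 1, 'min' => -1 } } }
rules.default = rules['=']   # unknown keys now resolve to the default rule

# Same data as document14b.yaml.
document = { 'value1' => 0, 'value2' => 1.1, 'value3' => -2.0 }

errors = []
document.each do |key, value|
  range = rules[key]['range']   # no rule for value1..value3, so the default applies
  errors << "[/#{key}] '#{value}': too large (> max #{range['max']})." if value > range['max']
  errors << "[/#{key}] '#{value}': too small (< min #{range['min']})." if value < range['min']
end
puts errors
```

A key with its own entry in the table would take precedence over the default, because Hash#default is consulted only when the key is missing.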
Merging Mappings
YAML allows the user to merge mappings.
- &a1
  A: 10
  B: 20
- <<: *a1          # merge
  A: 15            # override
  C: 30            # add
This is equal to the following Ruby code.
a1 = {"A"=>10, "B"=>20}
tmp = {}
tmp.update(a1) # merge
tmp["A"] = 15 # override
tmp["C"] = 30 # add
This feature allows Kwalify to merge rule entries.
schema15.yaml : merging rule entries example
type: map
mapping:
  "group":
    type: map
    mapping:
      "name": &name
        type: str
        required: yes
      "email": &email
        type: str
        pattern: /@/
        required: no
  "user":
    type: map
    mapping:
      "name":
        <<: *name             # merge
        length: { max: 16 }   # add
      "email":
        <<: *email            # merge
        required: yes         # override
document15a.yaml : valid document example
group:
  name: foo
  email: foo@mail.com
user:
  name: bar
  email: bar@mail.com
$ kwalify -lf schema15.yaml document15a.yaml
document15a.yaml#0: valid.
document15b.yaml : invalid document example
group:
  name: foo
  email: foo@mail.com
user:
  name: toooooo-looooong-name
$ kwalify -lf schema15.yaml document15b.yaml
document15b.yaml#0: INVALID
  - (line 4) [/user] key 'email:' is required.
  - (line 5) [/user/name] 'toooooo-looooong-name': too long (length 21 > max 16).
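To see what a merged rule looks like after parsing, the following sketch loads a small fragment with Ruby's standard yaml library. It assumes a Ruby whose YAML.safe_load accepts the aliases: keyword (the Psych parser of recent Ruby versions); the fragment and the key name user-name are made up for this illustration and are not part of schema15.yaml.

```ruby
require 'yaml'

# Hypothetical fragment: 'user-name' merges the anchored rule and adds 'length:'.
schema_src = <<~YAML
  name: &name
    type: str
    required: yes
  user-name:
    <<: *name
    length: { max: 16 }
YAML

# aliases: true is needed because the fragment refers to an anchor.
rules  = YAML.safe_load(schema_src, aliases: true)
merged = rules['user-name']

p merged          # entries of the anchored rule plus the added 'length:' entry
p rules['name']   # the anchored rule itself is left unchanged
```

The merged rule is an independent mapping, so adding or overriding entries in it does not affect the rule that carries the anchor.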