Exposing WordPress site data for mobile apps
WordPress is a terrific Content Management System. It is a great choice for managing online content, not just for our website, but also for other channels such as newsletters and mobile apps.
There is a problem though: the WordPress editor works with HTML, which is not always the most appropriate language to use for mobile apps. For instance, an app can provide a better user experience by playing a YouTube video in a native component for mobile, instead of parsing the embedded YouTube video from the HTML code.
This issue can be dealt with by transforming the HTML content into metadata: given a post, extract all of its important pieces of information (all the text in its paragraphs, all the sources of its images, all the URLs of its YouTube videos, etc.) and make these available to the mobile app.
Luckily for us, Gutenberg (the no-longer-so-new editor for WordPress) simplifies this task, because it already organizes all the post content into blocks, each of which contains its own data. Extracting the content from the post then becomes equivalent to extracting the data from each of the blocks contained in the post, which is not difficult to do.
Extracting block metadata
WordPress provides the function parse_blocks, which receives the content of a post and returns an array with the data of all the blocks within that content. Given a post, then, we can easily extract its metadata like this:
/**
 * Return the block information from the post
 */
function get_post_block_metadata($post)
{
    return parse_blocks($post->post_content);
}
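To make the shape of that return value more concrete, here is a hand-written array that mimics what parse_blocks returns for a post with one paragraph and one YouTube embed. The keys (blockName, attrs, innerBlocks, innerHTML, innerContent) are the ones WordPress produces; the sample values are invented for illustration, and the helper list_block_names is hypothetical.

```php
<?php
// Hand-written sample mimicking the array returned by parse_blocks();
// the values are invented for illustration.
$blocks = [
    [
        'blockName'    => 'core/paragraph',
        'attrs'        => [],
        'innerBlocks'  => [],
        'innerHTML'    => '<p>Hello world!</p>',
        'innerContent' => ['<p>Hello world!</p>'],
    ],
    [
        'blockName'    => 'core-embed/youtube',
        'attrs'        => ['url' => 'https://www.youtube.com/watch?v=9pT-q0SSYow'],
        'innerBlocks'  => [],
        'innerHTML'    => '<figure>...</figure>',
        'innerContent' => ['<figure>...</figure>'],
    ],
];

// Hypothetical helper: list the block names found at the top level.
function list_block_names(array $blocks): array
{
    return array_column($blocks, 'blockName');
}

print_r(list_block_names($blocks));
```

Attributes saved by the editor live under attrs, while anything stored only in the markup has to be read from innerHTML.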
We can manipulate the data too, such as extracting all the YouTube videos from the content. To achieve this, we iterate over all the blocks in the post, keep those of type "core-embed/youtube", and finally extract the "url" property from them:
/**
 * Extract all the YouTube video URLs from all Embed YouTube video blocks inside the post
 */
function get_youtube_videos($post)
{
    // Obtain the blocks from the content
    $blocks = parse_blocks($post->post_content);
    // Filter the Embed YouTube video blocks (strict comparison, since blockName can be null)
    $youtubeVideoBlocks = array_filter(
        $blocks,
        function($block) {
            return $block['blockName'] === "core-embed/youtube";
        }
    );
    // Extract the YouTube video URLs, reindexing because array_filter preserves the keys
    return array_values(array_map(
        function($block) {
            return $block['attrs']['url'];
        },
        $youtubeVideoBlocks
    ));
}
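One caveat: parse_blocks places nested blocks (for example, an embed inside a Columns block) under each block's innerBlocks key, so a filter over the top level alone can miss videos. Below is a minimal sketch of a recursive variant. The helper names flatten_blocks and extract_youtube_urls are hypothetical, the block name is the one used in this article (newer WordPress versions may name embed blocks differently), and the code runs against a hand-written sample rather than a real post.

```php
<?php
// Hypothetical helper: flatten a block tree into a single list,
// descending into each block's innerBlocks.
function flatten_blocks(array $blocks): array
{
    $flat = [];
    foreach ($blocks as $block) {
        $flat[] = $block;
        if (!empty($block['innerBlocks'])) {
            $flat = array_merge($flat, flatten_blocks($block['innerBlocks']));
        }
    }
    return $flat;
}

// Hypothetical helper: collect the URLs of all YouTube embed blocks.
function extract_youtube_urls(array $blocks): array
{
    $urls = [];
    foreach (flatten_blocks($blocks) as $block) {
        $name = $block['blockName'] ?? null;
        if ($name === 'core-embed/youtube' && isset($block['attrs']['url'])) {
            $urls[] = $block['attrs']['url'];
        }
    }
    return $urls;
}

// Hand-written sample: one embed at the top level, one inside a column.
$blocks = [
    ['blockName' => 'core-embed/youtube', 'attrs' => ['url' => 'https://www.youtube.com/watch?v=aaa'], 'innerBlocks' => []],
    ['blockName' => 'core/columns', 'attrs' => [], 'innerBlocks' => [
        ['blockName' => 'core/column', 'attrs' => [], 'innerBlocks' => [
            ['blockName' => 'core-embed/youtube', 'attrs' => ['url' => 'https://www.youtube.com/watch?v=bbb'], 'innerBlocks' => []],
        ]],
    ]],
];

print_r(extract_youtube_urls($blocks));
```

Inside WordPress, the same helper could be fed with parse_blocks($post->post_content) instead of the sample array.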
Exposing the data for external consumption
Starting from version 4.7, WordPress can be considered a headless Content Management System, where “headless” means that the rendering of the website is decoupled from its data. WordPress has historically rendered its content on the server as HTML, but since the addition of the WP REST API we can access the website’s data as a JSON object, and render this data through the tool of our choice (such as the JavaScript libraries Vue and React, among others).
Now, instead of dealing with pure HTML content, like this:
<html>
  <head>
    <title>My blog post</title>
  </head>
  <body>
    <p>Hello world!</p>
  </body>
</html>
we can access the data through a JSON object, like this:
{
  "title": {
    "rendered": "My blog post"
  },
  "content": {
    "rendered": "<p>Hello world!</p>"
  }
}
Through REST, we publish endpoints (i.e. pre-defined URLs), each of which exposes specific content. Unless disabled, every website running WordPress 4.7 or above has several default endpoints available that expose common data, such as the following ones:
- List of posts: /wp-json/wp/v2/posts
- Single post: /wp-json/wp/v2/posts/${post_id}
Because the data comes in JSON format, we can fetch it and display it in our mobile apps. What we need to do now is create REST endpoints that expose the block metadata from the post.
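As a rough idea of what such an endpoint involves, here is a minimal sketch. The namespace my-plugin/v1, the route and the function names are made up for this example; register_rest_route, get_post, parse_blocks and WP_Error are WordPress core APIs. The registration is guarded so that the file also loads outside WordPress, and only published posts are served since the route is public.

```php
<?php
// Hypothetical helper: keep only the name and attributes of each named block.
function my_plugin_summarize_blocks(array $blocks): array
{
    $summary = [];
    foreach ($blocks as $block) {
        // parse_blocks() uses a null blockName for content outside any block
        if (empty($block['blockName'])) {
            continue;
        }
        $summary[] = [
            'blockName' => $block['blockName'],
            'attrs'     => $block['attrs'] ?? [],
        ];
    }
    return $summary;
}

// REST callback (hypothetical name): receives the request, returns the block summary.
function my_plugin_get_post_blocks($request)
{
    $post = get_post((int) $request['id']);
    // Serve published posts only, because the route has no access restriction
    if (!$post || $post->post_status !== 'publish') {
        return new WP_Error('not_found', 'Post not found', ['status' => 404]);
    }
    return my_plugin_summarize_blocks(parse_blocks($post->post_content));
}

// Register the route only when running inside WordPress.
if (function_exists('add_action')) {
    add_action('rest_api_init', function () {
        register_rest_route('my-plugin/v1', '/blocks/(?P<id>\d+)', [
            'methods'             => 'GET',
            'callback'            => 'my_plugin_get_post_blocks',
            'permission_callback' => '__return_true',
        ]);
    });
}
```

Under these assumptions the data would be served at /wp-json/my-plugin/v1/blocks/${post_id}. This sketch only passes the raw attributes through; turning each block's HTML into clean properties is the larger part of the work.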
We could do this manually, processing all the different blocks and publishing their data through one or several endpoints. Luckily, there is a WordPress plugin that does exactly that, so we can save ourselves the effort. Let’s learn about it next.
REST endpoints for block metadata
The WordPress plugin Block Metadata (disclaimer: I am the author) makes it easy to extract and expose the metadata from all blocks inside a post, by exposing a REST endpoint to access the content:
- HTML content in all blocks in a single post: /wp-json/block-metadata/v1/data/${post_id}
In addition, it provides a second endpoint which converts the HTML code contained in the block into an agnostic format. The agnostic format enables the content to be used on any kind of application, including the more exceptional ones such as audio-controlled devices (like those powered by Amazon Alexa), augmented and virtual reality, a giant LED screen or a tiny Apple Watch. The endpoint is this one:
- Agnostic content in all blocks in a single post: /wp-json/block-metadata/v1/metadata/${post_id}
Let’s see an example response when extracting the metadata from a typical blog post containing paragraphs, lists, galleries of images, YouTube videos, quotes and others:
[
  {
    "blockName": "core\/paragraph",
    "meta": {
      "content": "Lorem ipsum dolor sit amet..."
    }
  },
  {
    "blockName": "core\/image",
    "meta": {
      "src": "https:\/\/ps.w.org\/gutenberg\/assets\/banner-1544x500.jpg"
    }
  },
  {
    "blockName": "core\/paragraph",
    "meta": {
      "content": "<em>Etiam tempor orci eu lobortis elementum nibh tellus molestie...<\/em>"
    }
  },
  {
    "blockName": "core-embed\/youtube",
    "meta": {
      "url": "https:\/\/www.youtube.com\/watch?v=9pT-q0SSYow",
      "caption": "<strong>This is the video caption<\/strong>"
    }
  },
  {
    "blockName": "core\/quote",
    "meta": {
      "quote": "Saramago sonogo\nEn la lista del longo",
      "cite": "<em>alguno<\/em>"
    }
  },
  {
    "blockName": "core\/image",
    "meta": {
      "src": "https:\/\/ps.w.org\/gutenberg\/assets\/banner-1544x500.jpg"
    }
  },
  {
    "blockName": "core\/heading",
    "meta": {
      "size": "xl",
      "heading": "Some heading here"
    }
  },
  {
    "blockName": "core\/gallery",
    "meta": {
      "imgs": [
        {
          "src": "https:\/\/newapi.getpop.org\/wp\/wp-content\/uploads\/2020\/01\/IMG_1250.jpg",
          "width": 1077,
          "height": 808
        },
        {
          "src": "https:\/\/newapi.getpop.org\/wp\/wp-content\/uploads\/2020\/01\/IMG_1770.jpg",
          "width": 1077,
          "height": 808
        },
        {
          "src": "https:\/\/newapi.getpop.org\/wp\/wp-content\/uploads\/2020\/01\/IMG_1912.jpg",
          "width": 1077,
          "height": 808
        }
      ]
    }
  },
  {
    "blockName": "core\/list",
    "meta": {
      "items": [
        "First element",
        "Second element",
        "Third element"
      ]
    }
  },
  {
    "blockName": "core\/audio",
    "meta": {
      "src": false
    }
  },
  {
    "blockName": "core\/paragraph",
    "meta": {
      "content": "Watch out the contrast!"
    }
  },
  {
    "blockName": "core\/file",
    "meta": {
      "href": "https:\/\/www.w3.org\/WAI\/ER\/tests\/xhtml\/testfiles\/resources\/pdf\/dummy.pdf",
      "text": "Contributor-Day <strong>download<\/strong> file"
    }
  },
  {
    "blockName": "core\/code",
    "meta": {
      "code": "function recursive_parse_blocks( $content ) {\n $ret = [];\n $blocks = parse_blocks( $content );\n recursive_add_blocks($ret, $blocks);\n return $ret;\n}"
    }
  },
  {
    "blockName": "core\/preformatted",
    "meta": {
      "text": "Some pre-formated text"
    }
  },
  {
    "blockName": "core\/pullquote",
    "meta": {
      "quote": "The will to win, the desire to succeed, the urge to reach your full potential\u2026 these are the keys that will unlock the door to personal excellence.",
      "cite": "Confucius"
    }
  },
  {
    "blockName": "core\/verse",
    "meta": {
      "text": "It is easy to hate and it is difficult to love. This is how the whole scheme of things works. All good things are difficult to achieve; and bad things are very easy to get."
    }
  }
]
Please notice how the metadata extracted from each block is relevant to the type of block:
Block Type | Metadata properties |
---|---|
Paragraph | content |
Image | src |
YouTube video embed | url, caption |
Heading | size, heading |
Image gallery | imgs (each with src, width and height) |
… | … |
This precise management of the data, based on their block type, provides granular control for data manipulation. For instance, we can extract only the URLs from all the YouTube videos contained in the post, and display them in the mobile app through a native video component, thus improving the user experience.
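As an illustration of that filtering on the consuming side, the sketch below decodes a short sample response and picks one property from one block type. It is shown in PHP for consistency with the rest of the article; a mobile client would do the equivalent in its own language. The sample is modeled on the response shown earlier, and the helper pluck_meta is hypothetical.

```php
<?php
// Short sample response, modeled on the one returned by the metadata endpoint.
$json = <<<'JSON'
[
  {"blockName": "core\/paragraph", "meta": {"content": "Lorem ipsum dolor sit amet..."}},
  {"blockName": "core-embed\/youtube", "meta": {"url": "https:\/\/www.youtube.com\/watch?v=9pT-q0SSYow", "caption": "<strong>This is the video caption<\/strong>"}},
  {"blockName": "core\/heading", "meta": {"size": "xl", "heading": "Some heading here"}}
]
JSON;

// Hypothetical helper: return the values of one meta property for one block type.
function pluck_meta(array $entries, string $blockName, string $property): array
{
    $values = [];
    foreach ($entries as $entry) {
        if (($entry['blockName'] ?? null) === $blockName && isset($entry['meta'][$property])) {
            $values[] = $entry['meta'][$property];
        }
    }
    return $values;
}

// Decode into associative arrays; json_decode turns "\/" back into "/".
$entries = json_decode($json, true);
print_r(pluck_meta($entries, 'core-embed/youtube', 'url'));
```

The same helper works for any other pair, such as the heading property of core/heading blocks.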
Extracting metadata from our own blocks
The plugin provides a filter, "Leoloso\BlockMetadata\Metadata::blockMeta", to enable custom blocks to specify how to extract their metadata. If the property to expose is already saved as an attribute in the Gutenberg block, then it can be immediately retrieved from the $block object, under that same attribute name:
add_filter("Leoloso\BlockMetadata\Metadata::blockMeta", "myblock_extract_metadata", 10, 3);
function myblock_extract_metadata($blockMeta, $blockName, $block)
{
    if ($blockName == "my-plugin/my-block-name") {
        return [
            "attribute1" => $block["attribute1"],
            "attribute2" => $block["attribute2"]
        ];
    }
    return $blockMeta;
}
If the property is saved as part of the content in the block (i.e. it hasn’t been stored as an independent attribute), then we need to create a regular expression, or regex, to extract it from within the content stored under the property innerHTML:
add_filter("Leoloso\BlockMetadata\Metadata::blockMeta", "myblock_extract_metadata", 10, 3);
function myblock_extract_metadata($blockMeta, $blockName, $block)
{
    if ($blockName == "my-plugin/my-block-name") {
        $matches = [];
        // Capture the contents of every <p> element in the block's HTML
        preg_match_all('/<p>(.*?)<\/p>/', $block['innerHTML'], $matches);
        return [
            "paragraphs" => $matches[1],
        ];
    }
    return $blockMeta;
}
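To see what such a regex captures, the sketch below runs a slightly more tolerant pattern against some invented innerHTML. The helper extract_paragraphs and the sample markup are hypothetical. The pattern also accepts attributes on the p tag, and the "s" modifier lets "." match newlines so that multi-line paragraphs are captured.

```php
<?php
// Invented innerHTML for a hypothetical custom block.
$innerHTML = "<div class=\"my-block\">\n"
    . "<p>First paragraph</p>\n"
    . "<p class=\"note\">Second <em>paragraph</em></p>\n"
    . "</div>";

// Hypothetical helper: return the inner HTML of every <p> element.
function extract_paragraphs(string $html): array
{
    // "(?:\s[^>]*)?" allows attributes without also matching tags such as <pre>
    preg_match_all('/<p(?:\s[^>]*)?>(.*?)<\/p>/s', $html, $matches);
    return $matches[1];
}

print_r(extract_paragraphs($innerHTML));
```

Regexes like this are fine for simple, predictable markup; for blocks whose HTML can nest arbitrarily, an HTML parser such as PHP's DOMDocument is usually more robust.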
Then, whenever the post contains our custom block, its data will be made available in the response from the REST endpoint:
[
  {
    "blockName": "my-plugin\/my-block-name",
    "meta": {
      "attribute1": "...",
      "attribute2": "..."
    }
  }
]
Conclusion
The arrival of Gutenberg in WordPress has meant more than having a new interface for creating our posts: because it organizes the data into blocks, which can be processed according to their block type, we now have a powerful tool to manipulate data with fine-grained control.
Combining Gutenberg with the WP REST API means that we can make the website data available in any format, for any device. For instance, we can retrieve all the YouTube video URLs from the post and display them through a native component in our mobile app.
Since HTML code is not always suitable for mobile apps, though, we need to first transform it to an agnostic format. We can do this task manually, or we can rely on the WordPress plugin Block Metadata to do it for us, and even incorporate our own blocks.