WXR Import: Import posts as correct user #1477

Merged (1 commit) on Jun 2, 2024

Member

@dd32 dd32 commented May 31, 2024

What is this PR doing?

This attempts to set the correct user for WXR imports.

Reading the source of the WXR importer in use, it appears that `default_author` should be set to the user's ID (not a `WP_User` object); otherwise it is set to the currently loaded user.

https://github.com/humanmade/WordPress-Importer/blob/140a53eb597de87d786cf942c8fe74fc97432f14/class-wxr-importer.php#L794-L807

What problem is it solving?

Fixes #1446: imports lack an author.

How is the problem addressed?

By passing the admin user's ID and calling `wp_set_current_user()`.

Testing Instructions

Note: This is an untested change.

Run the Playground link below and verify that the imported post has an author set.

https://playground.wordpress.net/?mode=seamless#{%22preferredVersions%22:{%22php%22:%228.0%22,%22wp%22:%22latest%22},%22phpExtensionBundles%22:[%22kitchen-sink%22],%22features%22:{%22networking%22:true},%22landingPage%22:%22wp-admin/edit.php%22,%22steps%22:[{%22step%22:%22importWxr%22,%22file%22:{%22resource%22:%22url%22,%22url%22:%22https://raw.githubusercontent.com/x3p0-dev/assets/master/x3p0-ideas/site-content.xml%22}},{%22step%22:%22login%22,%22username%22:%22admin%22,%22password%22:%22password%22}]}
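
The link above carries a Blueprint as JSON in the URL fragment. Here is a minimal sketch of how such a link could be assembled. `buildPlaygroundUrl` is a hypothetical helper, and escaping the serialized JSON with `encodeURI` is an assumption inferred from the shape of the URL above, not a documented Playground API.

```typescript
// Hypothetical helper, not part of Playground: serializes a Blueprint
// into the fragment of a Playground URL.
type Blueprint = Record<string, unknown>;

function buildPlaygroundUrl(
  blueprint: Blueprint,
  base = "https://playground.wordpress.net/?mode=seamless"
): string {
  // encodeURI escapes quotes, braces, and brackets, but leaves ":", ","
  // and "/" untouched, so decodeURI can restore the original JSON.
  return `${base}#${encodeURI(JSON.stringify(blueprint))}`;
}

// The same steps as in the testing link above.
const testBlueprint: Blueprint = {
  landingPage: "wp-admin/edit.php",
  steps: [
    {
      step: "importWxr",
      file: {
        resource: "url",
        url: "https://raw.githubusercontent.com/x3p0-dev/assets/master/x3p0-ideas/site-content.xml",
      },
    },
    { step: "login", username: "admin", password: "password" },
  ],
};

const testUrl = buildPlaygroundUrl(testBlueprint);
console.log(testUrl);
```

Decoding the fragment with `decodeURI` and parsing it as JSON yields the original Blueprint, which makes the link easy to inspect when a test run behaves unexpectedly.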

@bgrgicak bgrgicak merged commit 5927ee7 into trunk Jun 2, 2024
5 checks passed
@bgrgicak bgrgicak deleted the try/1446-import-correct-user branch June 2, 2024 20:38
adamziel added a commit that referenced this pull request Dec 11, 2024
…2058)

## Description

Adds the Data Liberation WXR importer as an option in the `importWxr`
step. The new importer is turned on by including the `"importer":
"data-liberation"` option:

```json
{
  "steps": [
    {
      "step": "importWxr",
      "file": {
        "resource": "url",
        "url": "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml"
      },
      "importer": "data-liberation"
    }
  ]
}
```

When the `importer` option is missing or set to `"default"`, nothing
changes in the behavior of the step and it continues using the
https://github.com/humanmade/WordPress-Importer importer.
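
The fallback rule described above can be sketched in TypeScript. The `ImportWxrStep` type and `resolveImporter` function are hypothetical names used for illustration, limited to the fields mentioned in this description; the actual step definition in Playground may look different.

```typescript
// Assumed option values, taken from the description above.
type ImporterName = "default" | "data-liberation";

// Hypothetical shape of the step, limited to the fields shown in the JSON example.
interface ImportWxrStep {
  step: "importWxr";
  file: { resource: "url"; url: string };
  importer?: ImporterName;
}

// A missing option and "default" both select the existing importer.
function resolveImporter(step: ImportWxrStep): ImporterName {
  return step.importer ?? "default";
}

const wxrUrl =
  "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml";

const legacyStep: ImportWxrStep = {
  step: "importWxr",
  file: { resource: "url", url: wxrUrl },
};

const liberationStep: ImportWxrStep = {
  ...legacyStep,
  importer: "data-liberation",
};

console.log(resolveImporter(legacyStep)); // "default"
console.log(resolveImporter(liberationStep)); // "data-liberation"
```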

The new importer:

* Rewrites links in the imported content
* Downloads assets through Playground's CORS proxy
* Parallelizes the downloads
* Communicates progress

This PR is a part of
#1894

## Implementation details

This `importWxr` step fetches and includes the
`data-liberation-core.phar` file. The phar file is built with
[Box](https://box-project.github.io/box/configuration/) and contains the
importer library with its dependencies, which is a subset of the Data
Liberation library, a subset of the Blueprints library, and a few vendor
libraries.

This, unfortunately, means that any changes in the PHP files require
rebuilding the .phar file. Here's how you can do it:

```bash
nx build:phar playground-data-liberation
```

You can also build the entire Data Liberation package as a WordPress
plugin complete with a wp-admin page:

```bash
nx build:plugin playground-data-liberation
```

Both commands will output the built files to
`packages/playground/data-liberation/dist`.

The progress updates are a first-class feature of the new importer. The
updated `importWxr` step receives them in real time via a
`post_message_to_js()` call running after every import step. Then, it
passes them on to the progress bar UI.
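
A rough sketch of the JavaScript side of that relay follows. The message shape (`type`, `progress`, `caption`), the `ProgressSink` interface, and `relayProgress` are all assumptions made for illustration; the payload actually sent through `post_message_to_js()` and the real progress bar API may differ.

```typescript
// Hypothetical payload; the real message format may differ.
interface ImportProgressMessage {
  type: "import-progress";
  progress: number; // fraction between 0 and 1
  caption?: string;
}

// Hypothetical stand-in for the progress bar UI.
interface ProgressSink {
  set(percent: number, caption?: string): void;
}

// Parses one raw message and forwards it to the sink.
// Returns false when the message is not a progress update.
function relayProgress(rawMessage: string, sink: ProgressSink): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawMessage);
  } catch {
    return false; // not JSON, ignore
  }
  if (typeof parsed !== "object" || parsed === null) {
    return false;
  }
  const message = parsed as Partial<ImportProgressMessage>;
  if (
    message.type !== "import-progress" ||
    typeof message.progress !== "number" ||
    !Number.isFinite(message.progress)
  ) {
    return false;
  }
  // Clamp to [0, 1] so a stray value cannot overflow the bar.
  const clamped = Math.min(1, Math.max(0, message.progress));
  sink.set(Math.round(clamped * 100), message.caption);
  return true;
}

// Usage with a sink that only records what it receives.
const received: Array<{ percent: number; caption?: string }> = [];
const recordingSink: ProgressSink = {
  set: (percent, caption) => received.push({ percent, caption }),
};

relayProgress(
  '{"type":"import-progress","progress":0.5,"caption":"Downloading media"}',
  recordingSink
);
console.log(received);
```

Validating each message before touching the UI keeps unrelated messages on the same channel from disturbing the progress bar.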

### Other changes

* **TLS traffic now goes through the CORS proxy.** Since the new
importer uses `AsyncHTTP\Client` which deals with raw sockets,
Playground's [TLS-based network
bridge](#1926)
runs the outbound traffic through a CORS proxy. Technically,
`TCPOverFetchWebsocket` gets the `corsProxy` URL passed to the
`playground.boot()` call.
* A few composer dependencies were forked, downgraded to PHP 7.2 using
Rector, and bundled with this PR to keep the Data Liberation importer
working.

## Remaining work

- [x] PHP 7.2 compatibility. Done by forking and Rector-downgrading
dependencies that were incompatible with PHP 7.2.
- [x] Report the importer's progress on the overall Blueprint progress
bar
- [x] Enqueue the data liberation plugin files for downloading at the
blueprint compilation stage
- [x] Don't eagerly rewrite attachment URLs in `WP_Stream_Importer`.
Exposing this information to the API consumer requires an explicit
decision. Do we rewrite it? Or do we ignore it?
- [x] Fix the TLS errors at the intersection of Playground network
transport and the async HTTP client library
- [x] Separate the markdown importer and its dependencies (md parser,
frontmatter parser, Symfony libraries) from the core plugin
- [x] Ship the importer and its tree-shaken deps (URL parser) as a
minified zip/phar

## Follow-up work

- [ ] Reconsider the `WP_Import_Session` API – do we need so many
verbosely named methods? Can we achieve the same outcomes with fewer
methods?
- [ ] Investigate why there's a significant delay before media downloads
start on PHP 7.2 – 7.4. It's likely a PHP.wasm issue.

## Testing instructions

* Default importer – [Open this
link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20})
and confirm it does what the current `importWxr` step does, that is, it
stays at "Importing content" for a moment, fails to fetch media files
(CORS issues in network tools), but inserts posts and pages.
* Data Liberation – [Open this
link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22importer%22:%20%22data-liberation%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}),
confirm the import progress is visible and that the content and media
indeed get imported:

![CleanShot 2024-12-08 at 14 54
49@2x](https://github.com/user-attachments/assets/a7da3244-a10f-43d2-8e94-43d305220a7e)

## Related issues

* #1211 
* #2012 
* #1477 
* #1250 
* #1780
Development

Successfully merging this pull request may close these issues.

Import WXR: Post authors not mapping
2 participants