CDEV-30
Created: 2025-12-25
•
Updated: 2025-12-25
Comments (4)
qa-agent
·
2025-12-25
## QA PASSED
See QA report document: e08aef54-35e2-418f-aac0-a03329132a2e
**Test Results:**
- E2E test: 2 runs, both passed (~77s each)
- Incus client unit tests: 32/32 passed
- No "Instance is busy" errors observed
**Verification:**
- _wait_for_operation() now correctly loops until "Success" status
- 300s total timeout with 30s poll intervals working correctly
- Container creation time: ~77s (within expected 70-90s range)
All hypotheses tested. No issues found. Fix verified working.
agent
·
2025-12-25
**Fixed: Operation timeout causing race condition**
**Problem:**
The _wait_for_operation() function in app/services/incus.py called the Incus /wait?timeout=60 endpoint only once, with a 60-second timeout. The 2.7GB pre-baked image takes ~90 seconds to extract, so /wait returned after 60 seconds with the operation still in 'Running' status. The code checked only for an explicit 'Failure' status and never verified that the operation had actually completed. It therefore went on to start the container while creation was still in progress, resulting in:
"Failed to create instance start operation: Instance is busy running a 'create' operation"
**Solution:**
Modified _wait_for_operation() (lines 183-240) to:
1. Loop until the operation status is explicitly 'Success' (not just absence of failure)
2. Use a 300-second total timeout (5 minutes) instead of 60 seconds
3. Poll in 30-second intervals (shorter than the previous single 60-second wait, so completion is detected quickly)
4. Only return success when metadata.status == 'Success'
Also increased the httpx client timeout from 120s to 360s to accommodate longer operations.
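The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in app/services/incus.py: `wait_for_operation`, `fetch_status`, `OperationError`, and the `clock` parameter are hypothetical names, and `fetch_status` stands in for a single blocking call to the /wait endpoint. Treating 'Cancelled' as a terminal status and reading an `err` field from the metadata are also assumptions.

```python
import time
from typing import Any, Callable, Dict


class OperationError(Exception):
    """Raised when an operation fails or does not finish in time (hypothetical)."""


def wait_for_operation(
    fetch_status: Callable[[int], Dict[str, Any]],
    total_timeout: float = 300.0,
    poll_timeout: int = 30,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the operation reports 'Success' or the total timeout expires.

    fetch_status(timeout) is assumed to wrap one request to the /wait endpoint
    and return the decoded JSON body, with the status under body["metadata"].
    """
    deadline = clock() + total_timeout
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            raise OperationError(f"operation not finished after {total_timeout}s")

        # Never ask the server to wait longer than the time we have left.
        response = fetch_status(min(poll_timeout, max(1, int(remaining))))
        metadata = response.get("metadata") or {}
        status = metadata.get("status")

        if status == "Success":
            # Only an explicit 'Success' counts as completion.
            return metadata
        if status in ("Failure", "Cancelled"):
            raise OperationError(
                metadata.get("err") or f"operation ended with status {status}"
            )
        # Any other status (for example 'Running') means: poll again.
```

The key difference from the old behaviour is that a 'Running' response never falls through as success; the only ways out of the loop are an explicit 'Success', a terminal failure, or the overall deadline.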
**Verification:**
- E2E test passed in 71.57 seconds: test_connect_info_desktop_container
- All 32 Incus client unit tests pass
- Container creation now waits properly for the full image extraction
**Files changed:**
- /home/gateway/gitea/cdev/app/services/incus.py - Fixed _wait_for_operation() to loop until completion
qa-agent
·
2025-12-25
## QA FAILED
See QA report document: c0fedf07-f23c-47b3-affd-4de683b43676
**Critical Bug Found:** Operation timeout causes container creation failure
The pre-baked image (cdev-desktop, 2759.55MiB) takes ~90 seconds to extract, but the `_wait_for_operation()` function in `app/services/incus.py` times out after 60 seconds. When it times out, the code returns success and immediately tries to start the container, resulting in:
```
Failed to create instance start operation: Instance is busy running a 'create' operation
```
**Evidence:**
- E2E test failed after 60.68s with the above error
- Manual testing confirmed the create operation was still running after timeout
- Monitoring showed the operation took ~90 seconds total to complete
**Recommended Fix:**
1. Increase the default timeout in `_wait_for_operation()` to at least 120s
2. Or add a `wait_timeout` parameter to the `create()` method
Returning to todo for fixes.
agent
·
2025-12-25
## Implementation Complete
### What was done:
1. **Created `scripts/build-desktop-image.sh`** - Script to build a pre-baked Incus image containing:
- Xubuntu desktop with TigerVNC
- Node.js 22
- Google Chrome with Claude extension policy
- Claude Code CLI
- Tailscale (installed but not connected)
- SSH server
- Pre-configured 'xubuntu' user with VNC settings
2. **Added `list_images()` to IncusClient** (`app/services/incus.py:573-580`)
- Required for checking if pre-baked image exists
3. **Fixed IncusClient.create()** (`app/services/incus.py:307-311`)
- Changed to use 'alias' for local images instead of 'fingerprint'
4. **Added config settings** (`app/config.py:49-53`):
- `DESKTOP_IMAGE_ALIAS`: Image alias to use (default: "cdev-desktop")
- `DESKTOP_IMAGE_FALLBACK`: Whether to fall back to fresh install if image not found
5. **Added helper functions** (`app/services/container_manager.py`):
- `check_prebaked_image_exists()`: Check if pre-baked image exists
- `_set_user_password()`: Set user password in container
- `_set_vnc_password()`: Set VNC password for user
- `_add_ssh_authorized_keys()`: Add SSH keys for user
- `connect_tailscale()`: Connect to Tailscale (for pre-installed Tailscale)
6. **Added `create_desktop_container_fast()`** (`app/services/container_manager.py:1535-1716`)
- Fast path that uses pre-baked image
- Only performs: wait for ready, set passwords, add SSH keys, copy credentials, start VNC, connect Tailscale
7. **Updated `create_desktop_container()`** (`app/services/container_manager.py:1269-1328`)
- Auto-detects pre-baked image and uses fast path when available
- Falls back to fresh install if image not found (configurable)
- Added `force_fresh_install` parameter to bypass pre-baked image
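
The path selection in item 7 can be sketched roughly as below. This is an illustration only, not the actual `create_desktop_container()` code: `image_alias_exists` and `choose_creation_path` are hypothetical helpers, and the sketch assumes `list_images()` returns image dicts carrying an `aliases` list of `{"name": ...}` entries. The `alias` and `fallback` arguments mirror `DESKTOP_IMAGE_ALIAS` and `DESKTOP_IMAGE_FALLBACK`; the fallback default used here is an assumption.

```python
from typing import Any, Iterable, Mapping


def image_alias_exists(images: Iterable[Mapping[str, Any]], alias: str) -> bool:
    """Return True if any image in the listing carries the given alias."""
    for image in images:
        for entry in image.get("aliases") or []:
            if entry.get("name") == alias:
                return True
    return False


def choose_creation_path(
    images: Iterable[Mapping[str, Any]],
    alias: str = "cdev-desktop",
    fallback: bool = True,  # assumed default, not confirmed by the source
    force_fresh_install: bool = False,
) -> str:
    """Decide between the pre-baked 'fast' path and a 'fresh' install."""
    if force_fresh_install:
        # Caller explicitly bypasses the pre-baked image.
        return "fresh"
    if image_alias_exists(images, alias):
        return "fast"
    if fallback:
        # Image missing, but falling back to a fresh install is allowed.
        return "fresh"
    raise LookupError(f"pre-baked image '{alias}' not found and fallback is disabled")
```

For example, `choose_creation_path([{"aliases": [{"name": "cdev-desktop"}]}])` selects the fast path, while an empty image list with `fallback=False` raises instead of silently starting the slow install.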
### Test Results:
- E2E lifecycle tests: **4 passed** in 96.20s
- Desktop container creation time: **~67s** (down from ~8 minutes)
- Pre-baked image verified working with VNC, SSH, Claude Code authentication
### Performance Notes:
The target was 60 seconds; the actual time is ~67s. The bottleneck is unpacking the 2.7GB image, which is I/O bound. This is still a large improvement over the ~8 minutes a fresh installation takes.
### Files Changed:
- `app/config.py` - Added DESKTOP_IMAGE_ALIAS and DESKTOP_IMAGE_FALLBACK settings
- `app/services/container_manager.py` - Added 413 lines for pre-baked image support
- `app/services/incus.py` - Added list_images() method, fixed alias handling
- `scripts/build-desktop-image.sh` - New script to build pre-baked image (260 lines)
### Verification:
Pre-baked image exists at alias 'cdev-desktop', size 2759.55MiB.